A sequence of N independent, identically distributed random variables
is observed from one of two stable distributions with known parameters. 
The likelihood-ratio test for discriminating between these two distributions 
is found explicitly and performance limitations are determined. 

When the two distributions differ only in location, the likelihood-ratio
test is sensitive to whether the distribution is nongaussian stable
(0 < α < 2), when nonlinear soft limiting of large deviations is used, or
gaussian stable (α = 2), when linear processing is used.

When the two distributions differ only in scale, the likelihood-ratio
test is sensitive to whether the distribution is nongaussian stable
(0 < α < 2), when nonlinear soft limiting of large deviations is used, or
gaussian (α = 2), when a chi-squared test is used.

The analysis of the two remaining cases, distinguishing between one of 
two characteristic indices, and between one of two skewness parameters, 
parallels the analysis of distinguishing between one of two scale parameters 
and is only touched upon briefly. 

I. INTRODUCTION 

The problem of classifying a series of observations as coming from 
one of two or more possible classes or hypotheses has received a great 
deal of attention in the statistical and engineering literature. In many 
physical situations, a variety of disturbances corrupt the observations ; 
rather than model each disturbance separately, it is often argued on 
physical grounds that the disturbances add and are independent, and 
the central limit theorem is invoked to model this sum using a gaussian 
distribution. This approach is adequate as long as the sum is not 
dominated by one or a few of the summands; if one or a few of the 
summands does dominate the sum, the disturbances can possibly be 
modeled as a stable distribution, one member of a family of probability 
distributions which includes the gaussian, by invoking a frequently 
overlooked generalization of the central limit theorem. 
The gaussian distribution has enjoyed great popularity in hypothesis
testing because it is analytically tractable and because it is the only 
stable distribution with finite variance. Although it may be argued 
that mathematical models with infinite variance are physically in- 
appropriate, this view conveniently overlooks the fact that the gaussian 
distribution is unbounded, which is also a physically inappropriate 
mathematical model. The gaussian model may adequately model dis- 
turbances over a narrow range of amplitudes; an infinite-variance, 
stable-distribution model may adequately model disturbances over a 
larger range of amplitudes. Both distributions may be physically in- 
appropriate mathematical models, but the infinite-variance distribu- 
tion may, in this sense, be the better model. This paper examines 
several stable-distribution hypothesis-testing problems. 

The primary motivation for this work on stable probability measures 
is drawn from a recent statistical analysis 1 of noise on various telephone 
lines. This analysis indicated telephone noise may be adequately 
modeled (on the lines examined) by a sum of sinusoids at various 
frequencies plus a purely nondeterministic random process that is well 
characterized by a stable distribution (either gaussian or nongaussian 
stable) . Since only a small number of lines were examined, this analysis 
is preliminary, awaiting other independent investigations.* 

Indirect motivation for this work is drawn from detecting electro- 
magnetic signals at frequencies of 100 kHz or less. Noise at these 
frequencies is claimed to be nongaussian ; unfortunately, adequate sta- 
tistical evidence to substantiate this claim is lacking, with one 
exception. 2 

A final source of motivation is found in financial problems. Over the 
last decade, a large body of statistical evidence has been amassed which 
indicates that the differences of logarithms of successive equally spaced 
prices of common stocks can be adequately modeled using stable 
distributions. 3,4 

II. OUTLINE OF DISCUSSION†

A sequence of N random variables is observed; for simplicity, it is 
assumed they are independent and identically distributed — drawn 
from one of two stable distributions with known parameters (characteristic
index 0 < α^j ≤ 2, skewness parameter −1 ≤ β^j ≤ 1, scale
parameter γ^j > 0, location parameter −∞ < δ^j < ∞; j = 0, 1).* It



* Applications of this work to removing telephone noise will be presented elsewhere.
† These results were first presented at the Eighth Annual Princeton Conference on
Information Sciences and Systems, March 28-29, 1974, p. 405, and at the 1975
Johns Hopkins Conference on Information Sciences and Systems, April 2-4, 1975,
pp. 49-51.

* Both subscripts and superscripts will be used to denote the stable-distribution 
parameters under hypothesis Hj(j = 0, 1); these parameters will be discussed more 
fully in Section III. 
is well known that the likelihood-ratio test is a decision rule that is
optimum with respect to either a Neyman-Pearson or Bayes criterion. 5 
Here, the likelihood ratio is found explicitly and performance limita- 
tions of the test are determined. The extension of these results from 
two to M stable distributions is well known and will not be dealt with 
here. 6 

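The (log) likelihood decision rule, because of the independence
assumption, takes the following simple form:

$$\Lambda' = \sum_{i=1}^{N} l(r_i) \;\underset{H_0}{\overset{H_1}{\gtrless}}\; L'$$

$$l(r_i) = \ln \frac{p(r_i; \alpha^1, \beta^1, \gamma^1, \delta^1)}{p(r_i; \alpha^0, \beta^0, \gamma^0, \delta^0)}$$

where {r_i}, i = 1, ..., N, are the N observed random variables, drawn from a
distribution with probability density p(x; α^j, β^j, γ^j, δ^j), and L' is a
threshold. Since l(r_i) can be rewritten as the sum of four functions,

$$\begin{aligned}
l(r_i) ={}& \ln \frac{p(r_i; \alpha^1, \beta^1, \gamma^1, \delta^1)}{p(r_i; \alpha^0, \beta^1, \gamma^1, \delta^1)}
 + \ln \frac{p(r_i; \alpha^0, \beta^1, \gamma^1, \delta^1)}{p(r_i; \alpha^0, \beta^0, \gamma^1, \delta^1)} \\
 &+ \ln \frac{p(r_i; \alpha^0, \beta^0, \gamma^1, \delta^1)}{p(r_i; \alpha^0, \beta^0, \gamma^0, \delta^1)}
 + \ln \frac{p(r_i; \alpha^0, \beta^0, \gamma^0, \delta^1)}{p(r_i; \alpha^0, \beta^0, \gamma^0, \delta^0)},
\end{aligned}$$

each of which tests for only one different parameter, this suggests
studying each of these four situations separately.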
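The one-parameter-at-a-time splitting of the log likelihood ratio can be checked numerically. The sketch below is illustrative only: it assumes the Cauchy density (α = 1, β = 0), for which a closed form exists, so only the scale and location terms appear; the function names are hypothetical and not taken from the source.

```python
import math

def cauchy_logpdf(x, scale, loc):
    """Log density of a Cauchy (alpha = 1, beta = 0) variable."""
    return math.log(scale / math.pi) - math.log((x - loc) ** 2 + scale ** 2)

def full_log_ratio(r, h1, h0):
    """l(r) = ln p(r; H1) - ln p(r; H0); h1 and h0 are (scale, loc) pairs."""
    return cauchy_logpdf(r, *h1) - cauchy_logpdf(r, *h0)

def telescoped_log_ratio(r, h1, h0):
    """Same quantity, changing one parameter at a time (scale, then location)."""
    (c1, d1), (c0, d0) = h1, h0
    scale_term = cauchy_logpdf(r, c1, d1) - cauchy_logpdf(r, c0, d1)
    location_term = cauchy_logpdf(r, c0, d1) - cauchy_logpdf(r, c0, d0)
    return scale_term + location_term

def decide(observations, h1, h0, threshold=0.0):
    """Return 1 (choose H1) if the summed log likelihood ratio exceeds L'."""
    total = sum(full_log_ratio(r, h1, h0) for r in observations)
    return 1 if total > threshold else 0
```

Because the intermediate densities cancel in pairs, the two functions agree for any observation; the same cancellation holds for the four-term version once densities for general α and β are available.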

Two special cases are examined in detail : when the distributions 
differ only in location and when they differ only in scale. The proba- 
bilities of error of the first and second kind are found for three analyti- 
cally tractable cases (gaussian, Cauchy, and Pearson V) by calculating 
the characteristic function of the log likelihood probability measure 
induced under each hypothesis ; the general case is apparently analyti- 
cally intractable, and quite expensive to tackle numerically at present. 
Exponentially sharp upper and lower bounds on both types of prob- 
abilities of error, and also the total probability of error, can be simply 
derived from the Laplace transform of the log likelihood probability 
measure induced under each hypothesis. These bounds are found 
analytically in three cases, and relatively inexpensive numerical results 
are presented for selected other cases. 

When the two distributions differ only in location, the likelihood-ratio
test is shown to be extremely sensitive to whether the distribution
is nongaussian stable (0 < α < 2), when nonlinear soft limiting of
large deviations is employed, or gaussian (α = 2), when linear processing
is used. When the distribution is nongaussian stable, performance
is found analytically to be quite sensitive to whether a linear (suboptimum)
or likelihood (optimum) decision rule is used: the total
probability of error for the linear test behaves asymptotically (N ≫ 1)
as O(A N^(1−α)), while the total probability of error for the likelihood-ratio
test is upper bounded by exp(−BN + C), where (A, B > 0, C) depend
on parameters of the two distributions and are independent of N.
(For related work that complements the results in our discussion, see 
the list of references and particularly Refs. 6, 7, and 8.) 
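The contrast between the two decision rules can be illustrated in the Cauchy case (α = 1). The sketch below makes several assumptions that are not from the source: equal a priori probabilities, a sample-mean test with its threshold midway between the two locations, and the Bhattacharyya bound (total error ≤ ½ρ^N, with ρ the integral of the square root of the product of the two densities) standing in for the Laplace-transform bounds discussed in the text. All function names are hypothetical.

```python
import math

def cauchy_pdf(x, c, loc):
    """Cauchy density with scale c and location loc."""
    return (c / math.pi) / ((x - loc) ** 2 + c ** 2)

def linear_test_error(d, c):
    """Total error of the sample-mean test for Cauchy noise, equal priors.

    The mean of N i.i.d. Cauchy variables is again Cauchy with the same
    scale c, so the result does not depend on N at all.
    """
    return 0.5 - math.atan(d / (2.0 * c)) / math.pi

def bhattacharyya_coefficient(d, c, steps=20000):
    """rho = integral of sqrt(p0 * p1), midpoint rule in theta with x = c*tan(theta)."""
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = -math.pi / 2 + (k + 0.5) * h
        x = c * math.tan(theta)
        jacobian = c / math.cos(theta) ** 2
        total += math.sqrt(cauchy_pdf(x, c, 0.0) * cauchy_pdf(x, c, d)) * jacobian
    return total * h

def likelihood_test_bound(d, c, n):
    """Upper bound 0.5 * rho**n on the total error of the likelihood-ratio test."""
    return 0.5 * bhattacharyya_coefficient(d, c) ** n
```

Written as exp(−BN + C), this bound has B = −ln ρ and C = ln ½, so the likelihood-ratio test improves exponentially with N while the linear test on Cauchy data does not improve at all.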

When the two distributions differ only in scale, the likelihood-ratio 
test is extremely sensitive to whether the distribution is nongaussian 
stable when nonlinear soft limiting of large deviations is used, or 
gaussian when a chi-squared test is used. Performance for nongaussian 
stable distributions is extremely sensitive to whether a suboptimum 
(chi-squared) or optimum (likelihood-ratio) test is used: the total 
probability of error for the chi-squared test behaves asymptotically 
(N » 1) as OiFN-t" 12 -"), while the total probability of error for the 
likelihood-ratio test is upper bounded by exp( — GN + H), where 
(F, G > 0, H) depend on parameters of the two distributions and are 
independent of N. 

The analysis of the two remaining cases, distinguishing between one 
of two characteristic indices and between one of two skewness parame- 
ters, closely parallels the analysis that distinguishes between two scale 
factors and is only touched upon here. 

The continuous-time analogs of these discrete-time problems, in which
a sample function from one of two stable, stationary,
independent-increment processes is observed for a finite time interval,
are studied in the second part of this work. There, in contrast with the
present part, the analysis is simpler, and many results can be obtained
analytically in closed form.

Section III deals with various mathematical preliminaries. A brief, 
selective, tutorial overview of the central limit theorem, infinitely 
divisible distributions, and independent-increment processes is pre- 
sented to place this work in perspective (as well as to fix notation). No 
attempt is made to be exhaustive in the discussion. 

The length of the discussion is due to the many special sets of 
parameter values that must be taken into account to be thorough. The 
main reason for this completeness is to adequately cover all cases where 
uncertainty is modeled using a distribution arising from a central- 
limit-theorem type of argument. The main contribution here is the 
results per se, many of which are presented here for the first time, which 
unfortunately often involve either tedious algebraic manipulation or 
machine calculations. It is hoped this will not obscure the surprising 
(at first glance) nature of the results: the quite singular behavior of 
both the log-likelihood-ratio test and (perhaps more importantly) its 
performance, for the gaussian vs nongaussian stable distribution, in 
distinguishing either location or scale. The generalization of these two 
results to a wide class of infinitely divisible distributions (which
include the family of stable distributions) is immediate, and is sketched 
at the end of Section IV. 

III. MATHEMATICAL PRELIMINARIES 

The reader is assumed to be familiar with the fundamentals of 
measure theory and probability theory, as found in standard 
references. 9-12 

Underlying the discussion to follow are : 

(i) The notion of a probability space: a triple (Ω, A, P), where
Ω is the set of elementary events, A is a σ-algebra of Borel
measurable subsets of Ω, and P is a probability measure on A.
(ii) The definition of a stochastic process x(t, ω) defined on a
parameter set E (henceforth called time), with t ∈ E, ω ∈ Ω,
which is a function mapping the direct product E × Ω into the
real line, and the associated probability measure induced by
x(t, ω).
(iii) The measure theoretic concept of absolute continuity of one
measure with respect to another, and the measure theoretic
Lebesgue decomposition theorem.

3.1 Infinitely divisible distributions and independent-increment processes 

In this section, various properties of infinitely divisible distributions 
and independent-increment processes are briefly reviewed. The inter- 
ested reader is referred to the literature for much more information. 12-16 

This tutorial section serves several purposes : 

(i) It gathers together for convenient reference all material on 
stable distributions to be used in Part II. 

(ii) It fixes notation. 

(iii) It emphasizes the central role played by stable distributions
in understanding both the central limit theorem and the Lévy
decomposition of the infinitely divisible distributions.

(iv) Finally, it alerts the reader to the rich structure and variety 
of infinitely divisible distributions, in general, and stable 
distributions, in particular, in the hope that they will find 
greater use in modeling uncertainty. 

The characteristic function of a (first-order) probability distribution
P(x)† is defined as

$$C_x(v) = \int_{-\infty}^{\infty} e^{ivx}\, dP(x) = E(e^{ivx}).$$



† Upper case P(·) will denote a probability distribution, while lower case p(·)
will denote the associated probability density function; all probability distributions
examined here in any detail are absolutely continuous with respect to Lebesgue
measure.



PROBABILITY MEASURES— I 1129 



It can be shown that two probability distributions are identical if and 
only if their characteristic functions are identical (Ref. 14, page 28);
thus, there is a one-to-one correspondence between characteristic func- 
tions and probability distribution functions. A random variable is 
said to be infinitely divisible if, for every natural number n, the random 
variable can be represented as the sum of n independent identically 
distributed (i.i.d.) random variables, or equivalently if its charac- 
teristic function can be written as 

$$C_x(v) = [C_x(v, n)]^n, \qquad n = 1, 2, \ldots,$$

where C x is the characteristic function of some probability distribution 
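which may depend on n. Two well-known examples of infinitely
divisible random variables are the gaussian [taking values on
(−∞, ∞)] and the Poisson (taking values at nonnegative integer
multiples of h):

$$\text{Gaussian:}\quad C_x(v) = \int_{-\infty}^{\infty} e^{ivx}\, \frac{1}{\sqrt{2\pi\sigma^2}} \exp\{-(x - m)^2/2\sigma^2\}\, dx = \exp(imv - \tfrac{1}{2}\sigma^2 v^2)$$

$$\text{Poisson:}\quad C_x(v) = \sum_{k=0}^{\infty} \frac{\lambda^k e^{-\lambda}}{k!}\, (e^{ivh})^k = \exp[\lambda(e^{ivh} - 1)].$$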
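Both the Poisson series and the infinite-divisibility property can be checked numerically. The following is a small sketch with hypothetical function names; it sums the series directly, compares it with the closed form, and allows the closed form to be raised to the n-th power with rate λ/n.

```python
import cmath
import math

def poisson_cf_series(v, lam, h, terms=200):
    """Characteristic function of a Poisson variable on multiples of h, by direct summation."""
    total = 0j
    weight = math.exp(-lam)            # P(k = 0)
    for k in range(terms):
        total += weight * cmath.exp(1j * v * h * k)
        weight *= lam / (k + 1)        # P(k + 1) from P(k)
    return total

def poisson_cf_closed(v, lam, h):
    """Closed form exp[lam * (exp(i v h) - 1)]."""
    return cmath.exp(lam * (cmath.exp(1j * v * h) - 1))
```

For any natural number n, `poisson_cf_closed(v, lam, h)` equals `poisson_cf_closed(v, lam / n, h) ** n`, which is the defining property of infinite divisibility: the n-th root is itself a Poisson characteristic function with rate λ/n.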

De Finetti conjectured that any infinitely divisible distribution could 
be written as the convolution of a gaussian and a generalization of the 
Poisson ; the resulting characteristic function can be written as 

$$\ln C_x(v) = imv - \tfrac{1}{2}\sigma^2 v^2 + \int (e^{ivu} - 1)\, dF(u),$$

where the measure F(u) specifies at what points the Poisson variable
takes on nontrivial values. However, this conjecture was shown to
hold only for a subset of the infinitely divisible distributions by Lévy;
if one desires a canonical form of the characteristic function of an 
infinitely divisible distribution, then the following remarkable theorem 
can be proved (Ref. 13, page 76). 

Theorem (Lévy): Any infinitely divisible characteristic function can be
uniquely written in the canonical form

$$\ln C_x(v) = i\delta v - \tfrac{1}{2}\sigma^2 v^2 + \int_{-\infty}^{0-} \left( e^{ivu} - 1 - \frac{ivu}{1 + u^2} \right) d\nu_-(u) + \int_{0+}^{\infty} \left( e^{ivu} - 1 - \frac{ivu}{1 + u^2} \right) d\nu_+(u),$$

where δ is a location parameter (−∞ < δ < ∞), σ² ≥ 0 is the variance
of the gaussian component, and (ν₋, ν₊) are called the Lévy measure of the
generalized Poisson distribution. The conditions the Lévy measure must
satisfy are (i) ν₋ and ν₊ are nondecreasing on the intervals (−∞, 0) and
(0, ∞), respectively, (ii) ν₋(−∞) = ν₊(∞) = 0, and (iii) for every
finite ε > 0,

$$\int_{-\epsilon}^{0-} u^2\, d\nu_-(u) < \infty, \qquad \int_{0+}^{\epsilon} u^2\, d\nu_+(u) < \infty.$$

Some examples now follow : 

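Example 1 (Poisson): δ = λh/(1 + h²), σ² = 0, ν₋(u) = 0 for −∞ < u < 0,

$$\nu_+(u) = \begin{cases} -\lambda, & 0 < u < h \\ 0, & h \le u < \infty; \end{cases}$$

∴ ln C_x(v) = λ(e^(ivh) − 1).

Example 2 (Cauchy): σ² = 0, β = 0,

$$\nu_-(u) = \frac{c}{\pi |u|}, \quad -\infty < u < 0; \qquad \nu_+(u) = -\frac{c}{\pi u}, \quad 0 < u < \infty;$$

∴ ln C_x(v) = iδv − c|v|.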
Example 3 (Gamma) : a 2 = 0, v- = 0, — oo < w < 

r y i + w 2 

d»/ + (w) = pe-« u d(lnM); 

Most of the attention here will be focused on one particular class of 
infinitely divisible distributions, the stable distributions. 

Definition: A probability distribution is said to be stable if, for all
a₁ > 0, a₂ > 0, b₁, b₂, there exist constants a > 0, b such that

$$P(a_1 x + b_1) * P(a_2 x + b_2) = P(ax + b),$$

where * denotes convolution. In other words, stable distributions are
closed under the action of the group of linear affine transformations on
the real line.

An important reason for examining stable distributions is found in 
the central limit theorem (Ref. 13, page 162; Ref. 15, page 168): 

Theorem : P(x) is a limiting distribution for a sum of suitably scaled and 
translated, independent, identically distributed, random variables if and 
only if P (x) is stable. 
In many practical problems, a large number of independent disturbances
add and introduce uncertainty in a measurement. To
analyze the effects of uncertainty, it is often convenient to replace this 
sum by its limiting distribution, which must be a stable distribution. 
The reader is referred to the bibliography for references on exactly 
what conditions govern the limiting distribution being gaussian vs 
nongaussian stable (Ref. 12, pages 171-190; Ref. 15, pages 165-169). 

Stable distributions are infinitely divisible; the associated Lévy
measures can be shown to be ν₋(u) = c₋|u|^(−α), ν₊(u) = −c₊u^(−α)
(Ref. 13, pages 164-168; Ref. 14, pages 128-133). Requirement (i)
that the measure be nondecreasing leads to α > 0, while the final
requirement (iii) forces α < 2. Substituting this into the canonical
representation of the characteristic function of an infinitely divisible
distribution and explicitly evaluating the integral over the Lévy
measure results in the following theorem:

Theorem (Ref. 18, page 164; Ref. 14, page 186): The characteristic function
of a stable distribution can be expressed as

$$\ln E(e^{ixv}) = \begin{cases}
-\gamma |v|^{\alpha} \left[ 1 + i\beta \dfrac{v}{|v|} \tan\left( \dfrac{\pi\alpha}{2} \right) \right] + i\delta v, & \alpha \neq 1, \\[2ex]
-\gamma |v| \left[ 1 + i\beta \dfrac{v}{|v|}\, \dfrac{2}{\pi} \ln |\gamma v| \right] + i\delta v, & \alpha = 1,
\end{cases}$$

where 0 < α ≤ 2, −1 ≤ β ≤ 1, γ > 0 (γ = c^α), −∞ < δ < ∞. For
0 < α < 1, β = (c₋ − c₊)/(c₋ + c₊); for 1 ≤ α ≤ 2, β = (c₊ − c₋)/(c₊ + c₋).
Note that for α = 2, the characteristic function, as a complex-valued
function of v, is C^∞, but for 1 < α < 2, it is only C¹, and for 0 < α ≤ 1
is only C⁰.
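The canonical form in the theorem translates directly into a short function. The sketch below transcribes the two branches as they read here, in the (α, β, γ, δ) parameterization of this paper; the function name is hypothetical, and other references use different sign and scale conventions, so the formula should be checked against the one in use before comparing numbers.

```python
import cmath
import math

def stable_log_cf(v, alpha, beta, gamma, delta):
    """ln E[exp(i x v)] of a stable law in the paper's (alpha, beta, gamma, delta) form."""
    if v == 0.0:
        return 0j
    sign = 1.0 if v > 0 else -1.0
    if alpha != 1.0:
        omega = math.tan(math.pi * alpha / 2.0)
        return -gamma * abs(v) ** alpha * (1 + 1j * beta * sign * omega) + 1j * delta * v
    # alpha == 1: the tangent is replaced by a logarithmic term
    omega = (2.0 / math.pi) * math.log(abs(gamma * v))
    return -gamma * abs(v) * (1 + 1j * beta * sign * omega) + 1j * delta * v
```

For α ≠ 1 the expression is linear in γ and δ, so adding the log characteristic functions of two independent stable variables with common α and β gives another stable log characteristic function with γ and δ summed; this is the closure property in the definition of stability.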

For fixed β (β ≠ 0), the characteristic function is discontinuous (as
a function of α) in the neighborhood of α = 1. One approach to this
problem is to rewrite the characteristic function (α ≠ 1) as

$$\begin{aligned}
\ln E(e^{ixv}) &= -\gamma |v|^{\alpha} \left[ 1 + i\beta \frac{v}{|v|} \tan\left( \frac{\pi\alpha}{2} \right) \right]
 + iv \left( \delta + \gamma\beta \tan\frac{\pi\alpha}{2} - \gamma\beta \tan\frac{\pi\alpha}{2} \right) \\
 &= -\gamma |v|^{\alpha} + i\gamma\beta v \tan\frac{\pi\alpha}{2} \left[ 1 - |v|^{\alpha - 1} \right]
 + iv \left( \delta - \gamma\beta \tan\frac{\pi\alpha}{2} \right).
\end{aligned}$$

If a new parameter δ' = δ − γβ tan(πα/2) is defined, then for β fixed

$$\lim_{\alpha \to 1} \tan\frac{\pi\alpha}{2} \left[ 1 - |v|^{\alpha - 1} \right] = \frac{2}{\pi} \ln |v|.$$

By inspection, this form of the characteristic function is not discontinuous
in the neighborhood of α = 1.
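The limit that removes the discontinuity is easy to confirm numerically. This is a minimal sketch with hypothetical function names; it evaluates the bracketed product on either side of α = 1 and compares it with (2/π) ln |v|.

```python
import math

def bracket(v, alpha):
    """tan(pi * alpha / 2) * [1 - |v|**(alpha - 1)], defined for alpha != 1."""
    return math.tan(math.pi * alpha / 2.0) * (1.0 - abs(v) ** (alpha - 1.0))

def limit_value(v):
    """Limit of bracket(v, alpha) as alpha tends to 1."""
    return (2.0 / math.pi) * math.log(abs(v))
```

The tangent diverges like −2/(π(α − 1)) near α = 1, while the bracket vanishes like −(α − 1) ln |v|, so the product stays finite and approaches the same value from both sides.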

Since the characteristic function is in L₁(−∞, ∞), all stable distributions
are absolutely continuous with respect to Lebesgue measure,
and have analytic probability density functions. Four parameters
completely specify a stable distribution:

(i) α, the characteristic index of the stable distribution P(X; α, β), is
associated with the asymptotic behavior of P(X; α, β). For
−1 < β < 1, 0 < α < 2,

$$\lim_{X \to -\infty} |X|^{\alpha} P(X) = k_- > 0, \qquad \lim_{X \to \infty} X^{\alpha} [1 - P(X)] = k_+ > 0.$$

For fi = — 1 (a similar argument holds for /3 = +1), Lipschutz 16 and 
Ibragimov and Linnik (Ref. 17, pages 62 to 66)* have shown that for 
1 < a < 2, 

P(X) = 0{k(a)\X\ a iw-°)expl-c(a)\X\ a i°- 1 l} as X -> - « 

lim X a [l - P(X)] = & + > 0, 
.a: -co 

while for < a < 1, 

P(X) = 0\k(a)X a iw-* e*v[-c{a)X- a i 1 - a ~]) as X J, + 
lim X«[l - P(X)] = k+ > 0, 

X-oo 

where k{a), c(a) are constants which depend only on a. For the asym- 
metric Cauchy probability density function, it can be shown (Ref. 17, 
pages 57 to 60) that 

p(X;o-l, 

= -l) = o[~exp(| \X\ -^exp(7r|X|/2)^l X -► - co 

lim p(X; a = 1, = -1)X 2 = /c + > 0. 

JC-»=o 

(ii) β characterizes skewness of the distribution: if β = 0 the distribution
is symmetric about x = δ. Otherwise,

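$$\lim_{X \to \infty} \frac{1 - P(X; \alpha, \beta) - P(-X; \alpha, \beta)}{1 - P(X; \alpha, \beta) + P(-X; \alpha, \beta)} = -\beta, \qquad
\lim_{X \to \infty} \frac{P(-X; \alpha, \beta)}{1 - P(X; \alpha, \beta)} = \frac{1 + \beta}{1 - \beta}.$$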

For 1 < α < 2, the distribution is skewed to the left for −1 ≤ β < 0,
since P(δ) < 1 − P(δ), with the degree of skewness increasing as β



* Note typographical errors in eqs. (2.4.30) and Theorem (2.4.7), of Ref. 17. 
decreases. It suffices to consider varying β over one half its range because
from the characteristic function it follows that the probability
density p(x) obeys the relation

$$p(x; \alpha, \beta, \gamma, \delta = 0) = p(-x; \alpha, -\beta, \gamma, \delta = 0).$$

(iii) γ (or c ≜ γ^(1/α)) is a measure of the dispersion or spread of the
distribution.

(iv) δ is a location parameter, and for 1 < α ≤ 2, δ is the mean.

Only three analytic closed-form expressions for stable probability 
density functions are known at present : 

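Gaussian (α = 2, −1 ≤ β ≤ 1):

$$p(x) = \frac{1}{\sqrt{4\pi}\, c} \exp\left[ -\left( \frac{x - \delta}{2c} \right)^2 \right], \qquad -\infty < x < \infty;$$

Cauchy (α = 1, β = 0):

$$p(x) = \frac{c}{\pi} \left[ (x - \delta)^2 + c^2 \right]^{-1}, \qquad -\infty < x < \infty;$$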



Pearson V (a — $, /3 — — 1) : 



P (*)=j c i(vT exp [-2<^»] 



x < 8 



and its conjugate density

$$p(x; \alpha = \tfrac{1}{2}, \beta = 1, \gamma, \delta = 0) = p(-x; \alpha = \tfrac{1}{2}, \beta = -1, \gamma, \delta = 0).$$

Series expansions are known for the remaining stable density functions 
(Ref . 14, pages 138-148) : 

p(x;a, p, 7 = 1, 5 = 0) 

, . (-l)*r(* + l) , 

= - L TT^ ^z^sin^ (0 - a) K a = 2, 

p(x; a, P, 7 - 1, 8 - 0) 

1 " (-l)T(/ca + 1) _ ,_. . far ,_ N n / / i 

= - X — n ar a * l sin -jr- (0 — a) < a < 1, 

p(x;a,P,y = 1, 8 = 0) 



[ f" t k [wa. (1 + j8)*}e-<* /T > ,ln '<tt"r 



= - Z ^Cr"**! / ^< yitl ( 1 i- J)l}e- {2:ih!tnt dl\ a = 1, 



where 

tan (0x/2) = tan (xa/2), and a: > 0. 



* For asymptotic expansions for a = 1, see Ref. 17, Theorem 2.4.3 and Ref. 18. 
The reader can check that the series for α = 2 reduces to the series for
the gaussian, and the series for 0 < α < 1 and |β| = 1 are zero on a
half line (cf. Pearson V). For (0 < α < 1, −1 < β < 1) and
(1 ≤ α ≤ 2, −1 ≤ β ≤ 1), stable probability densities have support
on (−∞, ∞). The series expansion for the density for 0 < α < 1 can
be used as an asymptotic expansion for the density for 1 < α < 2 for
|β| ≤ 1. It can be shown from the characteristic function directly that
all stable distributions are unimodal (Ref. 13, pages 158 to 161; Ref.
17, pages 66 to 76).

Figure 1 is a plot of various stable probability density functions for
fixed α (1 < α < 2) and several β; for α near two, it is quite difficult to
distinguish symmetric (β = 0) and asymmetric stable distributions.
Figure 2 shows that around the mode, all stable distributions appear
roughly gaussian, for 1 < α < 2 (note the logarithmic scale).

For α in the neighborhood of two, the gaussian and nongaussian
stable distributions are virtually identical around their mode, and it is
only in the tails of these distributions that the differences are pronounced.
One crude measure of the point at which the gaussian and
nongaussian stable distributions diverge is the point at which the first
term in the asymptotic series (α < 2) equals the gaussian density:
for α = 1.90, 1.95, 1.99, this occurs at 3.342, 3.635, 4.158 gaussian
standard deviations, respectively.

One reason stable distributions have attracted little attention in the
mathematical modeling of uncertainty is found in the theorem from
Ref. 14, page 169: A stable distribution with characteristic index α has
all absolute moments of order p, 0 < p < α < 2: E(|x|^p) < ∞. Conversely,
E(|x|^p) does not exist, i.e., it diverges, for p ≥ α, α < 2.

This suggests (albeit heuristically) that stable distributions may find
application in modeling uncertainty when, as the number of observations
increases, for 0 < α < 1, both the sample mean and sample
variance "wander erratically," being dominated by one or a few observations,
while for 1 < α < 2, the sample mean stabilizes but the
sample variance does not [cf. Refs. 1, 2, 3, 4].
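The moment theorem can be made concrete for the Cauchy density (α = 1), where truncated absolute moments have a simple form. The sketch below is an illustration under that assumption, with hypothetical function names: it integrates |x|^p against the density over |x| ≤ T, and for p = 1 compares the result with the closed form (c/π) ln(1 + T²/c²), which grows without bound as T increases.

```python
import math

def truncated_abs_moment(p, limit, c=1.0, steps=100000):
    """E[|x|^p ; |x| <= limit] for a Cauchy density with scale c and location 0.

    With x = c * tan(theta) the integrand |x|^p * p(x) dx becomes
    (c * tan(theta))^p / pi d(theta), integrated here by the midpoint rule.
    """
    theta_max = math.atan(limit / c)
    h = theta_max / steps
    total = sum((c * math.tan((k + 0.5) * h)) ** p for k in range(steps))
    return 2.0 * total * h / math.pi

def truncated_abs_mean_closed(limit, c=1.0):
    """Closed form of the p = 1 case: (c / pi) * ln(1 + limit^2 / c^2)."""
    return (c / math.pi) * math.log1p((limit / c) ** 2)
```

For p = 1 ≥ α the truncated moment keeps growing logarithmically with the truncation point, while for p = 1/2 < α it stays below its finite limit of 1/cos(π/4) = √2 (for c = 1).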

The generalization of these ideas from discrete time sequences of 
independent, identically distributed, random variables drawn from an 
infinitely divisible distribution to continuous time sample functions 
of an independent increment process is clear. The characteristic func- 
tional of a stationary independent increment process can be uniquely 
written as 



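$$\ln E\left\{ e^{iv[x(t) - x(s)]} \right\} = (t - s) \left[ i\delta v - \tfrac{1}{2}\sigma^2 v^2 + \int_{-\infty}^{0-} \left( e^{ivu} - 1 - \frac{ivu}{1 + u^2} \right) d\nu_-(u) + \int_{0+}^{\infty} \left( e^{ivu} - 1 - \frac{ivu}{1 + u^2} \right) d\nu_+(u) \right]$$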
Fig. 1 — Stable probability density functions [α = 1.1(0.2)1.7, β = −0.75(0.25)0.0];
scale factor c = 1.0; location parameter δ = 0.0.
for 0 ≤ s < t < T. The parameters δ, σ², and (ν₋, ν₊) have been defined
already. In words, any independent increment process can be decomposed
into

(i) A singular piece, called the drift, specified by δ.




Fig. 2 — Stable probability density functions (semilogarithmic) [α = 1.1(0.4)1.9,
β = −0.5(0.5)0.5]; scale factor c = 1.0; location parameter δ = 0.0.
(ii) A gaussian component, a component with continuous sample
paths that have unbounded variation with probability one
(w.p.1), specified by σ².
(iii) A generalization of the Poisson process called a jump process,
with sample paths that are constant except for simple jump
discontinuities at random times with random amplitudes,
specified by the Lévy measure (ν₋, ν₊).

A (separable) pure jump process, a stationary independent increment 
process with no gaussian component, has sample functions that are of 
bounded variation* with probability one if and only if 



$$\int_{-1}^{0-} |u|\, d\nu_-(u) + \int_{0+}^{1} u\, d\nu_+(u) < \infty.$$

An example of an independent increment process with bounded variation
(w.p.1) is a stable independent increment process (0 < α < 1),
while stable independent increment processes (1 ≤ α ≤ 2) have unbounded
variation (w.p.1). The intuitive meaning of the Lévy measure
is that first proposed by De Finetti: the Lévy measure specifies the
density of the amplitudes of the jumps of the Poisson process, provided
the process sample paths are of bounded variation (w.p.1).
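The split at α = 1 can be seen from a short calculation. The following is a sketch, not taken from the source; it assumes the one-sided stable Lévy measure ν₊(u) = −c₊u^(−α) quoted in Section 3.1 (the negative half line is handled the same way), so that dν₊(u) = αc₊u^(−α−1) du:

$$\int_{\epsilon}^{1} u\, d\nu_+(u) = \alpha c_+ \int_{\epsilon}^{1} u^{-\alpha}\, du =
\begin{cases}
\dfrac{\alpha c_+}{1 - \alpha} \left( 1 - \epsilon^{1 - \alpha} \right), & \alpha \neq 1, \\[2ex]
c_+ \ln (1/\epsilon), & \alpha = 1.
\end{cases}$$

As ε tends to 0+, the right side converges to αc₊/(1 − α) when 0 < α < 1 and diverges when 1 ≤ α < 2, which matches the bounded-variation criterion. Repeating the calculation with u² in place of u gives αc₊ε^(2−α)/(2 − α) for the integral over (0, ε), which is finite exactly when α < 2, in agreement with condition (iii) on the Lévy measure.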

By allowing δ, σ², and (ν₋, ν₊) to depend upon time, a time-varying
generalization of infinitely divisible distributions or nonstationary
independent increment processes is obtained. By examining nonanticipative
functionals of either a discrete time sequence of i.i.d. random
variables drawn from an infinitely divisible distribution, or a con- 
tinuous time independent increment process, a wide variety of Markov 
processes are derived. Thus, the generalizations of the results presented 
here to many other situations may sometimes be immediate. The 
richness of this class of random processes suggests these results may 
find wide application. 

Historically, the mathematical study of independent increment 
processes concentrated first on the gaussian case, then on the stable 
case, and finally on the general case. To date, most of the engineering 
literature has concentrated on the gaussian case or the purely Poisson 
case, with the notable exception of Frost. 19 It is hoped this work will 
suggest promising avenues of constructive research by studying the 
stable case, as well as shedding light on some of the quirks of the 
gaussian case. 

IV. DISCRETE TIME DETECTION OF TRANSLATES OF STABLE MEASURES 

One of two sequences of independent, identically distributed (i.i.d.),
stable, random variables is observed, under one of two hypotheses
(H_0, H_1):

$$\begin{aligned}
H_1:&\quad r_k = s^1 + n_k, \qquad 1 \le k \le N \\
H_0:&\quad r_k = s^0 + n_k
\end{aligned}$$



* The variation of a function f(t), 0 ≤ t ≤ T, is defined as sup Σ_{i=0}^{N−1} |f(t_{i+1}) − f(t_i)|,
where the supremum is over all possible partitions of the interval [0, T]: 0 = t_0 < t_1
< ··· < t_N = T.
The observed or received sequence is denoted {r_k}, k = 1, ..., N, while {n_k} is a
sequence of i.i.d. stable random variables with known parameters
(α, β, γ, δ = 0); both s¹ and s⁰ are known. The a priori probability of
H_j is denoted π_j (j = 0, 1). (The extension of allowing s¹, s⁰ to depend
on k is immediate and is not dealt with here.)

The measures induced by {r_k}, k = 1, ..., N, under H_0 and H_1 are clearly not
mutually orthogonal. Two cases occur: for (0 < α < 1, −1 < β < 1)
and (1 ≤ α ≤ 2, −1 ≤ β ≤ 1), the stable measures have support on
the whole real line, and hence are equivalent. For (0 < α < 1, β = 1
or −1), the stable measures have support on a half line, and hence one
measure is absolutely continuous with respect to the other but not
vice versa: the supports of the two measures overlap except for the
interval [s⁰, s¹). In either case, since the measures are not mutually
orthogonal, the decision rule, which as is well known minimizes both a
Bayes criterion as well as a Neyman-Pearson criterion, is the likelihood-ratio
test. 5 The goal is to find the exact form of this test, and
characterize its performance.* Performance here means calculating the
probability that H_1 is chosen given that H_0 is true, and the probability
that H_0 is chosen given that H_1 is true; these are called probabilities
of error of the first and second kind, and are denoted P_10 and P_01,
respectively. A quantity which is also of interest is the total probability
of error, defined as P_E = π_0 P_10 + π_1 P_01.

4.1 The likelihood ratio test 

The structure of the optimum detector is handled in two separate
cases. First, when (0 < α < 1, −1 < β < 1) or (1 ≤ α ≤ 2,
−1 ≤ β ≤ 1), the likelihood ratio is always strictly positive and finite,
and is

$$\Lambda = \prod_{i=1}^{N} \frac{p_n(r_i - s^1)}{p_n(r_i - s^0)} \;\underset{H_0}{\overset{H_1}{\gtrless}}\; L,$$

where p_n(·) is the probability density of n_k. An equivalent test is to
compute the log likelihood ratio,

$$\Lambda' = \ln \Lambda = \sum_{i=1}^{N} l(r_i) \;\underset{H_0}{\overset{H_1}{\gtrless}}\; \ln L = L',$$

where

$$l(r_i) = \ln \frac{p_n(r_i - s^1)}{p_n(r_i - s^0)},$$

and this can be explicitly calculated using the series expansions described
earlier. Before doing so, it is worthwhile to examine two


* A discussion of the power of this test (or any other test) is deliberately omitted.
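analytically tractable cases:

Gaussian (α = 2, −1 ≤ β ≤ 1):

$$p_n(x) = \frac{1}{\sqrt{4\pi}\, c}\, e^{-x^2/4c^2}, \qquad -\infty < x < \infty;$$

$$\therefore\quad l(r_i) = -\frac{1}{4c^2} \left[ (r_i - s^1)^2 - (r_i - s^0)^2 \right]$$

$$\Lambda' = \ln \Lambda = \frac{s^1 - s^0}{2c^2} \sum_{i=1}^{N} \left[ r_i - \tfrac{1}{2}(s^1 + s^0) \right] \;\underset{H_0}{\overset{H_1}{\gtrless}}\; \ln L = L'.$$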

The log likelihood test can be implemented using only linear processing.
The rule has the interpretation of comparing an energy-like
quantity, the received signal suitably translated and squared, with a
threshold. Equivalently, the test defines a hyperplane in R^N, and depending
upon which side of the hyperplane (r_1, ..., r_N) lies, H_1 or H_0
is chosen. All of this is well known (see Ref. 5, pages 94-97 and 163-173).
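The gaussian case can be sketched in a few lines. The code below assumes the gaussian density with variance 2c² used in this section, and uses the fact that the linear statistic is itself gaussian under either hypothesis to evaluate both error probabilities; the function names and the default threshold L' = 0 are illustrative choices, not taken from the source.

```python
import math

def gaussian_log_ratio(r, s1, s0, c):
    """l(r) for gaussian noise with density exp(-x^2 / 4c^2) / (sqrt(4 pi) c)."""
    return -((r - s1) ** 2 - (r - s0) ** 2) / (4.0 * c ** 2)

def linear_statistic(observations, s1, s0, c):
    """Equivalent linear form of the summed log likelihood ratio."""
    midpoint = 0.5 * (s1 + s0)
    return (s1 - s0) / (2.0 * c ** 2) * sum(r - midpoint for r in observations)

def q_function(x):
    """Upper tail probability of a standard gaussian variable."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def gaussian_error_probabilities(n, s1, s0, c, log_threshold=0.0):
    """Return (P10, P01) for the linear test with threshold L' (requires s1 != s0)."""
    d2 = (s1 - s0) ** 2
    mean = n * d2 / (4.0 * c ** 2)            # mean of the statistic under H1; its negative under H0
    std = math.sqrt(n * d2 / (2.0 * c ** 2))  # same standard deviation under both hypotheses
    p10 = q_function((log_threshold + mean) / std)
    p01 = q_function((mean - log_threshold) / std)
    return p10, p01
```

With L' = 0 the two error probabilities coincide and equal Q(|s¹ − s⁰|√N / (2√2 c)), so the error decays faster than exponentially in N for gaussian data.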

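Cauchy (α = 1, β = 0):

$$p_n(x) = \frac{c}{\pi}\, (x^2 + c^2)^{-1}, \qquad -\infty < x < \infty;$$

$$\therefore\quad l(r_i) = \ln \frac{(r_i - s^0)^2 + c^2}{(r_i - s^1)^2 + c^2}$$

$$\therefore\quad \Lambda' = \sum_{i=1}^{N} \ln \frac{(r_i - s^0)^2 + c^2}{(r_i - s^1)^2 + c^2} \;\underset{H_0}{\overset{H_1}{\gtrless}}\; \ln L = L'.$$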

Unlike the gaussian case, the Cauchy log likelihood detector operates
nonlinearly on the observation. A straightforward Taylor series
expansion of the log likelihood about $r_i = \tfrac{1}{2}(s^1 + s^0)$ shows that
for small perturbations about this point the log likelihood is linear in
the perturbing quantity. On the other hand, for large excursions in any
one observation,

$$\left| \frac{r_i - s^0}{c} \right| \gg 1, \qquad \left| \frac{r_i - s^1}{c} \right| \gg 1,$$

this one term in the sum behaves as $O(r_i^{-1})$ or, in other words, very
large excursions in the received signal are essentially (but not entirely)
discarded; this type of behavior will be called soft limiting. Only for
$N = 1$ does this test reduce to finding a hyperplane and determining on
which side of the hyperplane the observation lies in order to choose
$H_1$ or $H_0$.
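
Soft limiting can be seen by evaluating the per-sample Cauchy term next to the gaussian one. This is a minimal sketch: the function names are hypothetical, and the parameter values in the usage note are chosen only for illustration.

```python
import math


def cauchy_term(r, s1, s0, c):
    """Per-sample Cauchy log likelihood ln{[(r-s0)^2+c^2]/[(r-s1)^2+c^2]}."""
    return math.log(((r - s0) ** 2 + c * c) / ((r - s1) ** 2 + c * c))


def gaussian_term(r, s1, s0, c):
    """Per-sample gaussian log likelihood, linear in the observation r."""
    return (s1 - s0) / (2.0 * c * c) * (r - 0.5 * (s1 + s0))
```

With $s^1 = 10$, $s^0 = -10$, $c = 1$, the gaussian term keeps growing in proportion to $r$, whereas the Cauchy term falls off like $2(s^1 - s^0)/r$ once $r$ is large, so a single very large observation contributes almost nothing to the Cauchy sum.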
The cases $(0 < \alpha < 1,\ -1 < \beta < 1)$ and $(1 \le \alpha \le 2,\ -1 < \beta < 1)$
can now be examined; it is a straightforward exercise to substitute into
the log likelihood the series expansions for stable probability density
functions. Figures 3 and 4 show various representative log likelihood
ratios $[l(r_i)]$ for $(1 < \alpha < 2,\ -1 < \beta < 1)$ with $\alpha$ fixed and
$\beta$ varying; Fig. 5 shows the same log likelihood ratios as in Fig. 4
with $\beta$ fixed and $\alpha$ varying. Similar results hold in the remaining
cases $(0 < \alpha < 1,\ -1 < \beta < 1)$.

Fig. 3. Representative log likelihood functions ($s^1 = +10$, $s^0 = 0$) ($\alpha$ fixed,
$\beta$ varying); scale factor $c = 1.0$; location parameter $\delta = 0$.

Three points are emphasized here. First, the structure of the optimum
(log likelihood) detector is very sensitive to whether the underlying
distribution is gaussian or nongaussian stable; this is not surprising,
because small perturbations away from $\alpha = 2$ result in a singular
perturbation in the probability density function.* Second, when the
observation is in a neighborhood of $\tfrac{1}{2}(s^0 + s^1)$, an identical
Taylor series argument, as used in the Cauchy example, is applicable, and
small perturbations about this midway point result in linear perturbations
about the corresponding log likelihood point. Third, when large
excursions occur,

$$\left| \frac{r_i - s^1}{c} \right| \gg 1, \qquad \left| \frac{r_i - s^0}{c} \right| \gg 1,$$

the (log) likelihood for this term behaves as $O(r_i^{-1})$, which follows from
asymptotic expansions.



* However, stable distributions in the neighborhood of $\alpha = 2$ are all close with
respect to the topology induced by any reasonable metric, e.g., Prokhorov's metric. 
The first two points in the preceding discussion hold for
$(1 \le \alpha < 2,\ |\beta| = 1)$. The third point must be slightly modified
(assume now $\beta = -1$, since a similar argument follows immediately for
$\beta = 1$): $l(r_i) \sim O(r_i^{-1})$ for $r_i > 0$, but for $r_i < 0$,
$l(r_i) \sim O(-|r_i|^{1/(\alpha - 1)})$ (cf. gaussian case) $(1 < \alpha < 2)$,
while for $r_i < 0$, $\alpha = 1$,

$$l(r_i) \sim O[-\exp(\pi |r_i| / 2)].$$

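It remains to consider $\{n_k\}_1^N$, a sequence of i.i.d. stable random
variables with $(0 < \alpha < 1,\ |\beta| = 1)$. Assume from here on $\beta = -1$,
$s^1 > s^0$. The likelihood ratio is thus zero or strictly positive and finite,
and the log likelihood is either minus infinity or finite. First, consider
the Pearson V distribution as an example:

Pearson V $(\alpha = \tfrac{1}{2},\ \beta = -1)$:

$$p_n(x) = \begin{cases} \dfrac{1}{c\sqrt{2\pi}} \left( \dfrac{x}{c} \right)^{-3/2} \exp\left( -\dfrac{c}{2x} \right), & x \ge 0 \\[2mm] 0, & x < 0; \end{cases}$$

$$\therefore\ l(r_i) = \begin{cases} -\dfrac{3}{2} \ln \dfrac{r_i - s^1}{r_i - s^0} - \dfrac{c}{2} \left[ \dfrac{1}{r_i - s^1} - \dfrac{1}{r_i - s^0} \right], & r_i \ge s^1 > s^0 \\[2mm] -\infty, & s^1 > r_i \ge s^0; \end{cases}$$

$$\Lambda' = \sum_{i=1}^{N} l(r_i) \;\underset{H_0}{\overset{H_1}{\gtrless}}\; L' \qquad \text{if } r_i \ge s^1 > s^0 \text{ for all } i,\ 1 \le i \le N,$$

$$\Lambda' = -\infty \ (\text{choose } H_0) \qquad \text{if } s^1 > r_i \ge s^0 \text{ for some } i,\ 1 \le i \le N.$$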

If all the received signal samples are greater than $s^1$, the optimum
test is to compute the log likelihood and compare it with a threshold to
choose $H_1$ or $H_0$. Note that for $(r_i - s^1)/c \gg 1$, $l(r_i)$ decays
asymptotically as $O(r_i^{-1})$, and thus large deviations are weighted lightly.
For $r_i > s^1$, $(r_i - s^1) \ll c$, $l(r_i)$ grows in magnitude as
$(r_i - s^1)^{-1}$. If one or more observations fall in the interval
$[s^0, s^1)$, the optimum rule is to choose $H_0$.

The remaining cases $(0 < \alpha < 1$ and $\beta = -1)$ can be treated in an
identical manner, using the series expansion for the densities. The
important points are (i) the optimum detector is fundamentally
nonlinear; for $(r_i - s^1)/c \gg 1$, $l(r_i)$ decays as $O(r_i^{-1})$, (ii) if any
observation falls in the interval $[s^0, s^1)$, the optimum strategy is to
choose $H_0$, (iii) for $r_i > s^1$, $|r_i - s^1| \ll c$,
$l(r_i) \sim O[(r_i - s^1)^{-\alpha/(1 - \alpha)}]$, which reduces to the
Pearson V behavior $(r_i - s^1)^{-1}$ at $\alpha = \tfrac{1}{2}$.
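
The Pearson V decision rule, including the "one observation in $[s^0, s^1)$ forces $H_0$" behavior, can be sketched as follows. The function names are hypothetical, and the density convention assumed is the one-sided form with support on $x \ge 0$ used in this section; an observation exactly equal to $s^1$ is treated as falling in the forbidden interval, since the $H_1$ density vanishes there.

```python
import math


def pearson_v_term(r, s1, s0, c):
    """Per-sample log likelihood for the one-sided (alpha = 1/2) case, s1 > s0."""
    if r < s0:
        raise ValueError("observation below s0 has zero density under both hypotheses")
    if r <= s1:
        # The H1 density is zero here, so the likelihood ratio is zero.
        return -math.inf
    return (-1.5 * math.log((r - s1) / (r - s0))
            - 0.5 * c * (1.0 / (r - s1) - 1.0 / (r - s0)))


def pearson_v_log_likelihood(samples, s1, s0, c):
    """Sum of per-sample terms; equals -inf if any sample lies in [s0, s1]."""
    return sum(pearson_v_term(r, s1, s0, c) for r in samples)
```

A single sample inside the interval drives the sum to minus infinity, while samples far above $s^1$ each contribute roughly $\tfrac{3}{2}(s^1 - s^0)/r$, which is the soft limiting described above.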

4.2 Performance limitations 

To complete the solution of the problem, the probabilities of error
of the first and second kind must be calculated. This appears to be
quite difficult in the general case of an arbitrary stable distribution
and bounds are developed in Section 4.3. In this section the
performance of the optimum (log likelihood) detector is found explicitly
for the three analytically tractable stable distributions to illustrate the
problems that must be addressed in the general case. The approach
adopted is to calculate the characteristic function of the log likelihood
probability measure induced under either $H_1$ or $H_0$.

Fig. 4. Representative log likelihood functions ($s^1 = +10$, $s^0 = -10$) ($\alpha$ fixed,
$\beta$ varying); scale factor $c = 1.0$; location parameter $\delta = 0.0$.
[Panel shown: characteristic index $\alpha = 1.9$, curves labeled by skewness parameter $\beta$.]

1146 THE BELL SYSTEM TECHNICAL JOURNAL, OCTOBER 1976

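Gaussian $(\alpha = 2,\ -1 \le \beta \le 1)$

Section 4.1 showed that the log likelihood ratio is

$$\Lambda' = -\frac{1}{4c^2} \sum_{i=1}^{N} \left[ (r_i - s^1)^2 - (r_i - s^0)^2 \right],$$

and since the log likelihood is a sum of i.i.d. random variables, its
characteristic function can be found by using elementary Fourier
techniques. The results are:

$$\ln E(e^{iv\Lambda'} \mid H_1) = \frac{N (s^1 - s^0)^2}{4c^2} \left[ iv - v^2 \right]$$

$$\ln E(e^{iv\Lambda'} \mid H_0) = \frac{N (s^1 - s^0)^2}{4c^2} \left[ -iv - v^2 \right].$$

Using the Fourier inversion lemma, the density of the log likelihood
under either hypothesis can be found in closed form to be

$$p(\Lambda' \mid H_j) = \frac{1}{\sqrt{4\pi c'^2}} \exp\left[ -(\Lambda' - s'_j)^2 / 4c'^2 \right], \qquad -\infty < \Lambda' < \infty, \quad j = 0, 1,$$

where, matching the mean and variance implied by the characteristic
functions above,

$$c'^2 = \frac{N (s^1 - s^0)^2}{4c^2}, \qquad s'_1 = c'^2, \qquad s'_0 = -c'^2.$$

Fig. 5. Representative log likelihood functions ($s^1 = +10$, $s^0 = -10$) ($\alpha$ varying,
$\beta$ fixed).

The probabilities of an error of the first and second kind are

$$P_{10} = \Pr[\text{choose } H_1 \mid H_0 \text{ true}] = \int_{L'}^{\infty} p(\Lambda' \mid H_0)\, d\Lambda'$$

$$P_{01} = \Pr[\text{choose } H_0 \mid H_1 \text{ true}] = 1 - \tfrac{1}{2} \operatorname{erfc}\left( \frac{L' - s'_1}{2c'} \right) = \tfrac{1}{2} \left[ 1 + \frac{2z}{\sqrt{\pi}}\, {}_1F_1\left( \tfrac{1}{2}; \tfrac{3}{2}; -z^2 \right) \right], \qquad z = \frac{L' - s'_1}{2c'},$$

where erfc $(\cdot)$ is the complementary error function (Ref. 20, eq. 7.1.2)
and ${}_1F_1$ is a hypergeometric function (Ref. 20, eq. 7.1.21; see also
Slater, Ref. 21).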
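
The gaussian error probabilities can be evaluated directly from the complementary error function. The sketch below assumes that, under $H_1$ and $H_0$, the log likelihood is gaussian with means $+c'^2$ and $-c'^2$ and variance $2c'^2$, where $c'^2 = N(s^1 - s^0)^2/4c^2$ (read off from the characteristic functions above); the function name is hypothetical.

```python
import math


def gaussian_error_probabilities(n_samples, s1, s0, c, threshold=0.0):
    """Return (P10, P01) for the gaussian log likelihood ratio test."""
    c_prime_sq = n_samples * (s1 - s0) ** 2 / (4.0 * c * c)
    c_prime = math.sqrt(c_prime_sq)
    # P10: the log likelihood exceeds the threshold although H0 is true.
    p10 = 0.5 * math.erfc((threshold + c_prime_sq) / (2.0 * c_prime))
    # P01 = 1 - (1/2) erfc((L' - c'^2) / 2c'), written in the equivalent
    # form (1/2) erfc((c'^2 - L') / 2c') to avoid cancellation.
    p01 = 0.5 * math.erfc((c_prime_sq - threshold) / (2.0 * c_prime))
    return p10, p01
```

For a zero threshold and $s^1 = -s^0 = s$ both probabilities reduce to $\tfrac{1}{2}\operatorname{erfc}(\sqrt{N}\,s/2c)$.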

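Cauchy $(\alpha = 1,\ \beta = 0)$*

It was noted previously that the log likelihood ratio can be written as

$$\Lambda' = \sum_{j=1}^{N} \ln \frac{(r_j - s^0)^2 + c^2}{(r_j - s^1)^2 + c^2}.$$

The characteristic function for the log likelihood can be found just as
for the gaussian case:

$$E(e^{iv\Lambda'} \mid H_1) = \prod_{j=1}^{N} \int_{-\infty}^{\infty} \exp\left( iv \ln \frac{(r_j - s^0)^2 + c^2}{(r_j - s^1)^2 + c^2} \right) \frac{c}{\pi}\, \frac{dr_j}{(r_j - s^1)^2 + c^2}$$

$$= \prod_{j=1}^{N} \int_{-\infty}^{\infty} \left[ (r_j - s^0)^2 + c^2 \right]^{iv} \left[ (r_j - s^1)^2 + c^2 \right]^{-iv-1} \left( \frac{c}{\pi} \right) dr_j$$

$$= \left\{ \int_{-\infty}^{\infty} \left[ (x + \Delta)^2 + c^2 \right]^{iv} \left[ (x - \Delta)^2 + c^2 \right]^{-iv-1} \left( \frac{c}{\pi} \right) dx \right\}^N,$$

where

$$\Delta = \tfrac{1}{2}(s^1 - s^0), \qquad x = r_j - \tfrac{1}{2}(s^1 + s^0).$$

* The following analysis was suggested to the author by S. O. Rice; any errors in
the development here are the responsibility of the author alone.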
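It now helps to realize

$$(x \pm \Delta)^2 + c^2 = (x^2 + \Delta^2 + c^2) \left( 1 \pm \frac{2x\Delta}{x^2 + \Delta^2 + c^2} \right)$$

so that the characteristic function can be written as

$$E(e^{iv\Lambda'} \mid H_1) = \left\{ \frac{c}{\pi} \int_{-\infty}^{\infty} \left( 1 + \frac{2x\Delta}{x^2 + \Delta^2 + c^2} \right)^{iv} \left( 1 - \frac{2x\Delta}{x^2 + \Delta^2 + c^2} \right)^{-iv-1} \frac{dx}{x^2 + \Delta^2 + c^2} \right\}^N.$$

Expanding the two factors in binomial series, with summation indices $m$
and $n$, gives integrals of $x^{m+n} (x^2 + \Delta^2 + c^2)^{-(m+n+1)}$.
Only even powers of $(m + n)$ contribute to the integral. This
observation can be combined with the definition of the beta function (Ref. 20,
eq. 6.2.1) to show that, for $(m + n)$ even,

$$\frac{c}{\pi} \binom{iv}{m} \binom{-iv-1}{n} (-1)^n \int_{-\infty}^{\infty} \frac{(2x\Delta)^{m+n}\, dx}{(x^2 + \Delta^2 + c^2)^{m+n+1}} = \frac{c}{\pi} \binom{iv}{m} \binom{-iv-1}{n} (-1)^n \frac{(2\Delta)^{m+n}}{(\Delta^2 + c^2)^{(m+n+1)/2}}\, \frac{\Gamma^2\left( \frac{m+n+1}{2} \right)}{\Gamma(m + n + 1)}.$$

Substituting $(m + n) = 2l$, and using the identity (Ref. 20, eq. 6.1.18)

$$\frac{\Gamma(l + \tfrac{1}{2})}{\Gamma(2l)} = \frac{2\sqrt{\pi}}{4^l\, \Gamma(l)},$$

results in the final form of the log likelihood characteristic function
assuming $H_1$ is true,

$$E(e^{iv\Lambda'} \mid H_1) = \left[ \frac{c}{\sqrt{\Delta^2 + c^2}} \sum_{l=0}^{\infty} \left( \frac{\Delta^2}{4(\Delta^2 + c^2)} \right)^l \frac{(-iv)_{2l}}{(l!)^2}\; {}_2F_1(-2l,\ iv + 1;\ iv - 2l + 1;\ -1) \right]^N.$$

The term

$$(-iv)_{2l} = \frac{\Gamma(-iv + 2l)}{\Gamma(-iv)}$$

is standard notation for Pochhammer's symbol (Ref. 20, eq. 6.1.22).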
A similar expression results for the characteristic function of the log
likelihood, assuming the other hypothesis is true:

$$E(e^{iv\Lambda'} \mid H_0) = \left[ \frac{c}{\sqrt{\Delta^2 + c^2}} \sum_{l=0}^{\infty} \left( \frac{\Delta^2}{4(\Delta^2 + c^2)} \right)^l \frac{(1 - iv)_{2l}}{(l!)^2}\; {}_2F_1(-2l,\ iv;\ iv - 2l;\ -1) \right]^N,$$

with $\Delta = \tfrac{1}{2}(s^1 - s^0)$.

Since these series converge for all $v$ $(-\infty < v < \infty)$, as well as for all
(finite) values of $\Delta$ and $c$, the Fourier inversion lemma guarantees that
a unique inverse to these transforms exists, and thus in principle the
density of the log likelihood under either hypothesis is known and the
probabilities of error of the first and second kind can be calculated.
Numerical results are presented in a later section that were arrived at
in exactly this manner.

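Several additional observations can be made. For $N = 1$ the log
likelihood is a random variable whose distribution has compact support
on the interval

$$\ln \frac{\sqrt{\Delta^2 + c^2} - \Delta}{\sqrt{\Delta^2 + c^2} + \Delta} \le \Lambda' \le \ln \frac{\sqrt{\Delta^2 + c^2} + \Delta}{\sqrt{\Delta^2 + c^2} - \Delta}$$

and thus the support of the log likelihood distribution for any finite
number of samples, say $N$, is on the closed interval

$$N \ln \frac{\sqrt{\Delta^2 + c^2} - \Delta}{\sqrt{\Delta^2 + c^2} + \Delta} \le \Lambda' \le N \ln \frac{\sqrt{\Delta^2 + c^2} + \Delta}{\sqrt{\Delta^2 + c^2} - \Delta}.$$

Since the log likelihood distribution has compact support, it is well
known (Ref. 22, p. 121) that its Fourier transform has support on the
entire real axis. The second observation concerns the asymptotic
$(v \gg 1)$ behavior of the characteristic function of the log likelihood.
Since the saddle points of the log likelihood characteristic function
are at $x = \pm\sqrt{\Delta^2 + c^2}$, stationary phase arguments (Ref. 23)
show that asymptotically $(v \gg 1)$:

$$E(e^{iv\Lambda'} \mid H_1) = \left[ \int_{-\infty}^{\infty} \exp\left( iv \ln \frac{(x + \Delta)^2 + c^2}{(x - \Delta)^2 + c^2} \right) \left( \frac{c}{\pi} \right) \frac{dx}{(x - \Delta)^2 + c^2} \right]^N$$

$$\sim \left[ \frac{c^2\, (\Delta^2 + c^2)^{-1/4}}{2 \sqrt{\pi \Delta v}} \left\{ \frac{\exp\left( iv \ln \frac{\sqrt{\Delta^2 + c^2} + \Delta}{\sqrt{\Delta^2 + c^2} - \Delta} - \frac{i\pi}{4} \right)}{\sqrt{\Delta^2 + c^2} - \Delta} + \frac{\exp\left( -iv \ln \frac{\sqrt{\Delta^2 + c^2} + \Delta}{\sqrt{\Delta^2 + c^2} - \Delta} + \frac{i\pi}{4} \right)}{\sqrt{\Delta^2 + c^2} + \Delta} \right\} \right]^N$$

so that asymptotically the characteristic function decays as $|v|^{-N/2}$. A
similar result holds for the log likelihood characteristic function
assuming $H_0$ is true.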
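
The compact support claimed for $N = 1$ can be checked numerically: the per-sample Cauchy log likelihood, written in the centered variable $x$, attains its extreme values at $x = \pm\sqrt{\Delta^2 + c^2}$. The sketch below uses hypothetical function names and an arbitrary evaluation grid.

```python
import math


def centered_term(x, delta, c):
    """Per-sample Cauchy log likelihood in the centered variable x."""
    return math.log(((x + delta) ** 2 + c * c) / ((x - delta) ** 2 + c * c))


def support_edge(delta, c):
    """Upper end of the support, ln[(a + delta)/(a - delta)], a = sqrt(delta^2 + c^2)."""
    a = math.sqrt(delta * delta + c * c)
    return math.log((a + delta) / (a - delta))


def grid_extremes(delta, c, half_width=200.0, points=40001):
    """Smallest and largest values of the term on an evenly spaced grid."""
    step = 2.0 * half_width / (points - 1)
    values = [centered_term(-half_width + k * step, delta, c) for k in range(points)]
    return min(values), max(values)
```

No grid value exceeds `support_edge` in magnitude, and the value at the stationary point matches it, which is consistent with the stationary phase analysis involving exactly these two points.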

An alternate approach is to calculate the Mellin transform of the 
likelihood probability density (for N = 1), then raise it to the Nth
power and find the inverse transform; this was investigated without
success. A direct approach, convolving the probability density of the 
log likelihood with itself N times, was also attempted; the resulting 
integrals were intractable. 

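Pearson V $(\alpha = \tfrac{1}{2},\ \beta = -1)$

Assuming $H_1$ is true, the characteristic function of the log likelihood is

$$E(e^{iv\Lambda'} \mid H_1) = \prod_{j=1}^{N} \int_{s^1}^{\infty} \exp\left[ -\frac{3iv}{2} \ln \frac{r_j - s^1}{r_j - s^0} - \frac{ivc}{2} \left( \frac{1}{r_j - s^1} - \frac{1}{r_j - s^0} \right) \right] \frac{1}{c\sqrt{2\pi}} \left( \frac{r_j - s^1}{c} \right)^{-3/2} \exp\left( -\frac{c}{2(r_j - s^1)} \right) dr_j$$

$$= \left\{ \int_{\Delta}^{\infty} \exp\left[ -\frac{3iv}{2} \ln \frac{x - \Delta}{x + \Delta} - \frac{ivc}{2} \left( \frac{1}{x - \Delta} - \frac{1}{x + \Delta} \right) \right] \frac{1}{c\sqrt{2\pi}} \left( \frac{x - \Delta}{c} \right)^{-3/2} \exp\left( -\frac{c}{2(x - \Delta)} \right) dx \right\}^N,$$

where $\Delta = \tfrac{1}{2}(s^1 - s^0)$, $x = r_j - \tfrac{1}{2}(s^1 + s^0)$. All attempts to
simplify this expression were unsuccessful. Stationary phase arguments show
that asymptotically $(v \gg 1)$

$$E(e^{iv\Lambda'} \mid H_1) \sim \left\{ \left[ k_1 v^{-1/2} \exp(i k_2 v + i k_3) \right] + O\left( \frac{1}{v} \right) \right\}^N,$$

where $(k_1, k_2, k_3)$ are complicated functions of $(c, \Delta)$.

An attempt was made to find $E(e^{iv\Lambda'} \mid H_0)$, assuming no observation
occurred in the interval $(s^0, s^1)$; this approach encountered the same
problems as finding $E(e^{iv\Lambda'} \mid H_1)$, and was unsuccessful.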

It is worth noting that the log likelihood has only one maximum on
the interval $(s^1, \infty)$, for $E(e^{iv\Lambda'} \mid H_j)$ $(j = 0, 1)$, and hence
only one stationary point enters into the stationary phase asymptotic
expression for $E(e^{iv\Lambda'} \mid H_j)$. It can be shown this behavior is typical
of any asymmetric $(|\beta| = 1)$ stable distribution. In contrast, the log
likelihood has two maxima for any stable distribution $(-1 < \beta < 1)$, and
hence two stationary points (cf. Cauchy).

Neither the use of Mellin transforms (instead of Fourier transforms)
nor convolving the log likelihood density with itself N times made the
problem any more tractable.

In the case of an arbitrary stable distribution, it appears quite
difficult to find the density of the log likelihood by calculating the
characteristic function of the log likelihood probability measure
induced under either $H_1$ or $H_0$, because only series expansions are known
at present for stable probability density functions (except for the three
cases covered here). Even resorting to numerical approximation
techniques poses some quite difficult problems: for $0 < \alpha < 2$,
$-1 \le \beta \le 1$ (as for the Cauchy and Pearson V distributions) the log
likelihood characteristic function has its support on the entire axis, and
oscillates and decays asymptotically as $O[(e^{iv\omega_0}/\sqrt{v})^N]$ from
stationary phase arguments.* To accurately approximate numerically the
probabilities of error of the first and second kind from the log likelihood
characteristic function, the characteristic function must be approximated
and stored at a great many frequencies, and the total cost (especially due
to storage) can be quite high. Furthermore, one would like to carry out
calculations for many different values of $(\alpha, \beta, \gamma, \delta)$. The
storage cost plus the large number of parameter variations often desired can
make this program quite expensive at present.

4.3 Analytic performance bounds 

Because of analytical and numerical problems encountered in ex- 
plicitly calculating the probabilities of errors of the first and second 
kind, as well as the total probability of error, bounds on these quantities 
were investigated. 

* Different contours of integration (e.g., path of steepest descent) were investigated
without success.

Let $P_1$ and $P_0$ be probability measures defined on the same measure
space $(\Omega, \mathcal{A})$. For $0 < q < 1$, define

$$dh_q(P_1, P_0) = \left( \frac{dP_1}{d\mu} \right)^{q} \left( \frac{dP_0}{d\mu} \right)^{1-q} d\mu,$$

where $\mu$ is any measure defined on $(\Omega, \mathcal{A})$ such that $\mu \gg P_1$,
$\mu \gg P_0$. (An example of such a $\mu$ is $\mu = P_0 + P_1$.) This definition of
$h_q$ is seen by inspection to be independent of $\mu$. Define

$$H_q(P_1, P_0) = \int_{\Omega} dh_q(P_1, P_0)$$

as the Kakutani inner product of $P_0$ with $P_1$ (Ref. 24); the classical
Hellinger integral is a special case of the Kakutani inner product, and
is defined as $H_{1/2}(P_1, P_0)$. It is known that

$$0 \le H_q(P_1, P_0) \le 1,$$

with $H_q = 1$ iff $P_1 = P_0$ a.e. The Kakutani inner product can be
thought of intuitively as the amount of "colinearity" or "overlap" of
two probability measures, with the larger the Kakutani inner product,
the larger the "overlap." A number of useful properties of the Kakutani
inner product are summarized in the following easily proven lemma
(Refs. 24, 25):

Lemma: (1) $P_0$ and $P_1$ are mutually orthogonal (denoted $P_0 \perp P_1$)
$\Leftrightarrow H_q(P_0, P_1) = 0 \Leftrightarrow h_q(P_0, P_1) = 0$.

(2) If $0 < q < 1$, $H_q(P_0, P_1)$ is continuous in $q$. Four cases
determine the behavior of $H_q(P_0, P_1)$ at $q = 0, 1$:

(2a) If $P_0$ and $P_1$ are equivalent, then $H_q(P_0, P_1)$ is continuous
at $q = 0$ and $q = 1$.
(2b) If $P_0$ is absolutely continuous with respect to $P_1$ but not vice
versa, then $H_q(P_0, P_1)$ is continuous at $q = 1$ but not at
$q = 0$.
(2c) If $P_1$ is absolutely continuous with respect to $P_0$ but not vice
versa, then $H_q(P_0, P_1)$ is continuous at $q = 0$ but not at
$q = 1$.
(2d) If $P_0$ and $P_1$ are neither mutually orthogonal nor equivalent,
then $H_q(P_0, P_1)$ is discontinuous at $q = 0$, $q = 1$.

(3) $H_q(P_0, P_1)$ and its logarithm are convex functions, $0 < q < 1$.
The convexity is strict iff $(dP_1/dP_0)(x)$ is not constant
for all $x \in \operatorname{supp}(P_0) \cap \operatorname{supp}(P_1)$.

It is instructive to rewrite $H_q(P_0, P_1)$ in two different ways to
explicitly show the relationship between the log likelihood functional and
the Kakutani inner product:

(i) $H_q(P_0, P_1) = \int \exp[q \ln(dP_1/dP_0)]\, dP_0 = E\{\exp[q \ln(dP_1/dP_0)] \mid H_0\}$,
(ii) $H_q(P_0, P_1) = \int \exp\{(q - 1) \ln(dP_1/dP_0)\}\, dP_1 = E\{\exp[(q - 1) \ln(dP_1/dP_0)] \mid H_1\}$.

(i) and (ii) are the Laplace transforms of the log likelihood probability
density (also called the moment generating function of $\Lambda'$), evaluated
at $q$ and $(q - 1)$, and assuming $H_0$ and $H_1$ are true, respectively. It is
196-197). Using Hölder's inequality, it is straightforward to show that
the logarithm of $H_q$ and, hence, $H_q$ itself, are convex functions of $q$,
$0 < q < 1$.

Chernoff (Ref. 26) was apparently first to use $H_q(P_0, P_1)$ (where
$P_0 \ll \mu$, $P_1 \ll \mu$, $\mu$ = Lebesgue measure) to upper bound the
probabilities of error of the first and second kind, and his work has found
widespread application in the engineering and statistical literature (see
also Ref. 14, pages 517-520 and the references therein).

In the notation used here, Chernoff showed

$$P_{01} \le \inf_{0 < q < 1} H_q(P_0, P_1)\, e^{-qL'}$$

$$P_{10} \le \inf_{0 < q < 1} H_q(P_0, P_1)\, e^{-(q - 1)L'},$$

where $L'$ is the threshold in the log likelihood ratio test.

Chernoff's original ideas have been generalized in several directions.
Kraft (Ref. 27) obtained upper and lower bounds on the total probability of
error. For some choice of $L'$ (see also Ref. 28):

$$\tfrac{1}{2} \min(\pi_0, \pi_1)\, H_{1/2}^2(P_0, P_1) \le P_e \le (\pi_0 \pi_1)^{1/2}\, H_{1/2}(P_0, P_1).$$

Hellman and Raviv (Ref. 29) have also worked on this problem. Shannon,
Gallager, and Berlekamp (Ref. 25) obtained lower bounds on the probabilities
of error of the first and second kind in terms of the logarithm of
$H_q(P_0, P_1)$, and the first and second derivatives of the logarithm.

Here the Kakutani inner product plays two key roles, providing a
check on whether or not singular or perfect detection is possible
[iff $H_q(P_0, P_1) = 0$], as well as giving exponentially sharp bounds on
the performance of the log likelihood ratio test if detection is not
singular. Since the Kakutani inner product need only be calculated at a
small number of values of $q$ to accurately numerically approximate
upper and lower bounds on error probabilities, unlike calculating the
probabilities of error of the first and second kind from the log likelihood
characteristic function, this approach may be useful as a practical
design tool because it is relatively inexpensive.

The following observations are straightforward exercises:

(i) When a sequence of $N$ i.i.d. random variables is observed,
$H_q(P_0, P_1) = e^{-AN}$, where $A$ is independent of $N$, depending
solely on $P_0$, $P_1$, and $q$.
(ii) When $P_0$ and $P_1$ are absolutely continuous with respect to
Lebesgue measure, and the corresponding densities are unimodal
translates of one another, then for fixed $q$, the larger the
separation the smaller the inner product $H_q(P_0, P_1)$.
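
Both observations can be illustrated for gaussian densities by simple quadrature. The sketch below is an illustration under stated assumptions: the function names, the midpoint rule, and the integration window are all choices made here, and the closed form $\exp[-Nq(1-q)(s^1 - s^0)^2/4c^2]$ used for comparison is obtained by completing the square in the exponent rather than quoted from the text. The integration window is assumed to be wide compared with the separation $|s^1 - s^0|$.

```python
import math


def gaussian_density(x, c):
    """Gaussian stable density exp(-x^2/4c^2)/sqrt(4 pi c^2)."""
    return math.exp(-x * x / (4.0 * c * c)) / math.sqrt(4.0 * math.pi * c * c)


def kakutani_single(q, s1, s0, c, half_width=40.0, points=8000):
    """Midpoint-rule estimate of the integral of p(x-s1)^q p(x-s0)^(1-q)."""
    middle = 0.5 * (s1 + s0)
    step = 2.0 * half_width * c / points
    total = 0.0
    for k in range(points):
        x = middle - half_width * c + (k + 0.5) * step
        total += gaussian_density(x - s1, c) ** q * gaussian_density(x - s0, c) ** (1.0 - q)
    return total * step


def kakutani(q, s1, s0, c, n_samples):
    """Inner product for N i.i.d. samples: the single-sample value to the Nth power."""
    return kakutani_single(q, s1, s0, c) ** n_samples


def gaussian_closed_form(q, s1, s0, c, n_samples):
    """Closed form assumed for comparison (completing the square)."""
    return math.exp(-n_samples * q * (1.0 - q) * (s1 - s0) ** 2 / (4.0 * c * c))
```

The exponent is proportional to $N$, as in observation (i), and doubling the separation lowers the inner product, as in observation (ii).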
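The Kakutani inner product $H_q(P_0, P_1)$ can be explicitly calculated
for the three analytic cases discussed earlier:

Gaussian $(\alpha = 2,\ -1 \le \beta \le 1)$:

$$p_n(x) = \frac{1}{\sqrt{4\pi c^2}} \exp(-x^2/4c^2), \qquad -\infty < x < \infty$$

$$H_q(P_0, P_1) = e^{N\mu(q)}, \qquad \mu(q) = \ln \int_{-\infty}^{\infty} p_n^{q}(x - s^1)\, p_n^{1-q}(x - s^0)\, dx = -\frac{q(1 - q)(s^1 - s^0)^2}{4c^2}.$$

Cauchy $(\alpha = 1,\ \beta = 0)$:

$$p_n(x) = \frac{c}{\pi}\,(x^2 + c^2)^{-1}, \qquad -\infty < x < \infty$$

$$H_q(P_0, P_1) = \left[ \int_{-\infty}^{\infty} p_n^{q}(x - s^1)\, p_n^{1-q}(x - s^0)\, dx \right]^N$$

$$\int_{-\infty}^{\infty} p_n^{q}(x - s^1)\, p_n^{1-q}(x - s^0)\, dx = \frac{c}{\sqrt{\Delta^2 + c^2}} \sum_{j=0}^{\infty} \left( \frac{\Delta^2}{4(\Delta^2 + c^2)} \right)^j \frac{(1 - q)_{2j}}{(j!)^2}\; {}_2F_1(q,\ -2j;\ q - 2j;\ -1),$$

where $\Delta = (s^1 - s^0)/2$.
From tables (Ref. 30, 263.00) for elliptic integrals:

$$H_{1/2}(P_0, P_1) = \left\{ \frac{2}{\pi} \left[ \left( \frac{\Delta}{c} \right)^2 + 1 \right]^{-1/2} \operatorname{cn}^{-1}\left( 0,\ \left[ 1 - \left[ \left( \frac{\Delta}{c} \right)^2 + 1 \right]^{-1} \right]^{1/2} \right) \right\}^N,$$

where $\operatorname{cn}^{-1}(\cdot, \cdot)$ is an inverse Jacobian elliptic function; with first
argument zero it equals the complete elliptic integral of the first kind.
Pearson V $(\alpha = \tfrac{1}{2},\ \beta = -1)$:

$$p_n(x) = \begin{cases} \dfrac{1}{c\sqrt{2\pi}} \left( \dfrac{x}{c} \right)^{-3/2} \exp\left( -\dfrac{c}{2x} \right), & x \ge 0 \\[2mm] 0, & x < 0 \end{cases}$$

$$H_q(P_0, P_1) = \left[ \int_{s^1}^{\infty} p_n^{q}(x - s^1)\, p_n^{1-q}(x - s^0)\, dx \right]^N.$$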

The integral could not be expressed in any other analytic form. Since
$P_1$ is absolutely continuous with respect to $P_0$, but not vice versa,
$H_q(P_0, P_1)$ is continuous for $q \in (0, 1]$, and is discontinuous at $q = 0$.
Apparently only in the gaussian case does the Kakutani inner product
or the Hellinger integral reduce to a simple form, and for general stable
distributions the problem appears to be analytically intractable at
present. Thus, it seemed worthwhile to investigate numerical methods
for approximating the desired integrals. Again it seems important to
emphasize that an accurate approximation of the log likelihood
probability density Laplace transform under $H_1$ or $H_0$ is needed at only a
small number of choices of $q$, so the calculations can be quite
inexpensive. In the previous section, the log likelihood characteristic
function had to be approximated at a great many frequencies, and the
resulting computation effort and storage made that program relatively more
expensive.
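
For the Cauchy case the single-sample Hellinger integral, $(c/\pi)\int dx / \sqrt{[(x - \Delta)^2 + c^2][(x + \Delta)^2 + c^2]}$, is cheap to evaluate in two independent ways. The sketch below is a stand-in, not the method of the paper: it uses the arithmetic-geometric mean as a convenient way of computing the complete elliptic integral (an assumption of this example) and compares it with brute-force quadrature after the substitution $x = \tan t$. All function names are hypothetical.

```python
import math


def agm(x, y, iterations=40):
    """Arithmetic-geometric mean of two positive numbers."""
    for _ in range(iterations):
        x, y = 0.5 * (x + y), math.sqrt(x * y)
    return x


def hellinger_closed(delta, c):
    """Single-sample Cauchy Hellinger integral via the elliptic integral,
    using K(k) = pi / (2 agm(1, sqrt(1 - k^2))) with k = delta/sqrt(delta^2 + c^2)."""
    return c / agm(math.sqrt(delta * delta + c * c), c)


def hellinger_quadrature(delta, c, points=200000):
    """Midpoint rule on (-pi/2, pi/2) after the substitution x = tan(t)."""
    step = math.pi / points
    total = 0.0
    for k in range(points):
        x = math.tan(-0.5 * math.pi + (k + 0.5) * step)
        root = math.sqrt(((x - delta) ** 2 + c * c) * ((x + delta) ** 2 + c * c))
        total += (c / math.pi) / root * (1.0 + x * x)
    return total * step
```

The two evaluations agree closely, and the value equals one when the two location parameters coincide, as it must for identical measures.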

4.4 Numerical approximation of performance bounds 

At present, three approaches have been investigated for calculating
stable probability density functions. The first involves summing power
series and asymptotic series (Ref. 31), the second involves quadrature of
an integral representation of the density (Ref. 32), and the third uses a
discrete fast Fourier transform of the characteristic function (Ref. 33,
pages 35-42; and Ref. 34).

The approach used here was a combination of the first and third 
methods. The stable probability density function was approximated 
over its central region via a discrete fast Fourier transform, while 
asymptotic expansions were used outside this region. This approach 
avoids the difficulty of knowing how to merge the power series and 
asymptotic series (see Ref. 31). 
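
As a rough illustration of obtaining a stable density from its characteristic function, the sketch below inverts $\exp[-(c|t|)^{\alpha}]$ for the symmetric case $\beta = 0$ by plain trapezoidal quadrature. This is a simplified stand-in for the fast Fourier transform approach described above, not a reproduction of it; the function name, the truncation point, and the restriction to $1 \le \alpha \le 2$ are assumptions of the example.

```python
import math


def symmetric_stable_density(x, alpha, c, points=20000):
    """Estimate p(x) = (1/pi) * integral_0^inf exp(-(c t)^alpha) cos(t x) dt."""
    if not 1.0 <= alpha <= 2.0:
        raise ValueError("this sketch only covers 1 <= alpha <= 2")
    # Beyond t_max the characteristic function is below exp(-37), about 1e-16.
    t_max = 37.0 ** (1.0 / alpha) / c
    step = t_max / points
    total = 0.5  # half weight for the integrand at t = 0, where it equals 1
    for k in range(1, points):
        t = k * step
        total += math.exp(-((c * t) ** alpha)) * math.cos(t * x)
    total += 0.5 * math.exp(-((c * t_max) ** alpha)) * math.cos(t_max * x)
    return total * step / math.pi
```

The estimate reproduces the Cauchy and gaussian densities used in this paper at $\alpha = 1$ and $\alpha = 2$. The truncation point grows like $37^{1/\alpha}$, which gives some feeling for why small characteristic indices are expensive to handle this way.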

The Kakutani inner product was broken into two integrals. The first
integral was approximated by a fixed step size Romberg integration
routine (Ref. 35) using the discrete fast Fourier transform approximation
to the density (typically, 4096 points were used). The second integral was
approximated by a variable step size Romberg integration algorithm
using the asymptotic expansion for the density.

Fig. 6. Hellinger integral vs $(s^1 - s^0)/c$ [$\alpha = 0.5(0.2)1.9$, $\beta = 0$].
[Plotted quantity: $H = \int \sqrt{p(x - s^1)\, p(x - s^0)}\, dx$ with $p(x) = p(x; \alpha, \beta = 0)$.]

Fig. 7. Hellinger integral vs $(s^1 - s^0)/c$ [$\alpha = 1.90(0.01)2.00$, $\beta = 0$].

While this approach is adequate for finite mean stable distributions
$(1 < \alpha \le 2)$, and with care works for $0.5 \le \alpha \le 1$, it is
inadequate for $0 < \alpha < 0.5$, because the expense is too great at present.
The reason is that for $0 < \alpha < 1$, a great many evenly spaced points must
be used to adequately approximate the characteristic function in the
neighborhood of the origin (where its derivative is unbounded), as well as
at other frequencies, and the expense of storing these values (to carry out
the discrete fast Fourier transform) is prohibitive. One possible
way around this problem is to use only the series expansion
(see Ref. 33).

All results presented here were calculated on a Honeywell 6070 
computer using double-precision arithmetic (14 significant figures) ; the 
estimated relative error in all cases was less than a tenth of one percent. 

Figure 6 shows the Hellinger integral for various parameters
[$\alpha = 0.5(0.2)1.9$, $\beta = 0$] as a function of $[(s^1 - s^0)/c]$, for $N = 1$.
This figure suggests an interesting conjecture, that the Hellinger
integral is smaller the closer the characteristic index $\alpha$ is to two, all
other factors being the same. No proof of this is known at present.

Figure 7 depicts results of numerically calculating the Hellinger
integral for various characteristic indices close to two
[$\alpha = 1.90(0.01)1.99$, $\beta = 0$], for $N = 1$. The singular nature of the
gaussian distribution ($\alpha = 2$) is quite evident when compared with that
of $\alpha = 1.99$ or $\alpha = 1.98$.

Figure 8 shows $\mu(q)$ vs $q$ for fixed $[(s^1 - s^0)/c]$. Again, the closer the
index is to two, the smaller the inner product.

Figure 9 presents $\mu(q)$ vs $q$ for various choices of $[(s^1 - s^0)/c]$, and
fixed characteristic index $\alpha$ and skewness parameter $\beta$; the larger
$(s^1 - s^0)/c$, the smaller $H_q(P_0, P_1)$.

4.5 Comparison of the performance of the log likelihood decision rule
($\alpha = 1.95$) with a linear decision rule

It is interesting to compare the performance of the log likelihood
decision rule with a linear decision rule, when the observations are
drawn from a nongaussian stable distribution with characteristic index
near two. To be explicit, it is assumed the observations are i.i.d. stable
random variables ($\alpha = 1.95$, $\beta = 0$), with $\pi_0 = \pi_1 = \tfrac{1}{2}$ and
$s^1 = -s^0 = S$ chosen for simplicity. The linear decision rule is simply

$$\sum_{i=1}^{N} r_i \;\underset{H_0}{\overset{H_1}{\gtrless}}\; 0.$$

This sum is a stable random variable, with parameters
($\alpha = 1.95$, $\beta = 0$, $N\gamma$, $Ns^j$), assuming $H_j$ $(j = 0, 1)$ is true. The
total probability of error is equal to the probability of either an error of
the first or second kind,

$$P_E = P_{10} = P_{01},$$

and can be computed from the series described earlier, or from
published tables (Ref. 31). This is plotted in Fig. 10 as a function of
$[(s^1 - s^0)/c]$ for various $N$. The same figure includes plots of the
Hellinger integral upper bound on the total probability of error using the
log likelihood decision rule. The figure makes it quite clear that the log
likelihood decision rule, for many cases of interest, has a much smaller
probability of error than the linear decision rule.

Fig. 8. Logarithm of Kakutani inner product $H_q$ vs $q$ [$\alpha = 1.1(0.4)1.9$, $\beta = 0$]
[$(s^1 - s^0)/c = 10$].
Asymptotically, the total probability of error for the linear detection
strategy behaves as

$$P_E \sim O[(S/c)^{-\alpha} N^{1 - \alpha}],$$

while the probability of error for the log likelihood detection strategy
asymptotically behaves as

$$P_e = O(e^{-AN}),$$

where $A = A(\alpha, \beta, \gamma, \delta) > 0$, independent of $N$. This simple
asymptotic analysis suggests that the log likelihood decision rule has a much
smaller probability of error than the linear decision rule, for large $N$,
which is borne out in Fig. 10.

Fig. 9. Logarithm of Kakutani inner product $H_q$ vs $q$ [$(s^1 - s^0)/c = 1, 2, 10, 100$]
($\alpha = 1.90$, $\beta = 0$).

Fig. 10. Linear processing probability of error and Hellinger integral upper bound
on nonlinear processing probability of error vs $(s^1 - s^0)/c$ ($\alpha = 1.95$, $\beta = 0$).
[Curves shown: upper bound for optimum processing and $P_E$ for linear processing, with
$H = [\int_{-\infty}^{\infty} \sqrt{p(x - s^1)\, p(x - s^0)}\, dx]^N$, $p(x) = p(x; \alpha, \beta, \gamma, \delta)$,
$\alpha = 1.95$, $\beta = 0.0$, $\gamma = c^{\alpha}$, $\delta = 0$.]
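
The contrast between power-law and exponential decay is easiest to see in the Cauchy case ($\alpha = 1$), which is used below instead of $\alpha = 1.95$ because everything is available in closed form; this substitution is an assumption of the example, as are the function names. For Cauchy noise the sum of $N$ observations has location $\pm NS$ and scale $Nc$, so the linear rule does not improve with $N$ at all, in agreement with the $N^{1-\alpha}$ law, while the Hellinger upper bound $\tfrac{1}{2}H^N$ on the optimum rule decays exponentially.

```python
import math


def linear_rule_error(s, c, n_samples):
    """Error of the linear rule for Cauchy noise; n_samples cancels out
    because the summed location N s and scale N c grow at the same rate."""
    return 0.5 - math.atan(s / c) / math.pi


def single_sample_hellinger(s, c, points=2000):
    """(c / (pi a)) * integral_0^pi dtheta / sqrt(1 - k^2 sin^2 theta),
    with a = sqrt(s^2 + c^2) and k = s / a, by the midpoint rule."""
    a = math.sqrt(s * s + c * c)
    k_sq = (s / a) ** 2
    step = math.pi / points
    total = sum(1.0 / math.sqrt(1.0 - k_sq * math.sin((j + 0.5) * step) ** 2)
                for j in range(points))
    return c / (math.pi * a) * total * step


def hellinger_upper_bound(s, c, n_samples):
    """Upper bound (1/2) H^N on the optimum error for equal priors."""
    return 0.5 * single_sample_hellinger(s, c) ** n_samples
```

With $S/c = 3$ the linear rule stays at roughly ten percent error for every $N$, while the bound on the optimum rule is already negligible at $N = 50$.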

4.6 Comparison of the upper and lower bounds and $P_E$

It remains to compare the bounds on total probability of error, and 
probabilities of errors of the first and second kind, with the actual 
quantities. None of the bounds employed here are tight, because the 
upper and lower bounds have different exponents. This program is quite 
difficult, and has only been carried out analytically for the gaussian 
case, and numerically for the Cauchy case. The remaining cases can 
be handled numerically following Shannon et al. (Ref. 25). For simplicity, from
this point on it is assumed that $\pi_0 = \pi_1 = \tfrac{1}{2}$, $s^1 = -s^0 = s$.

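Gaussian $(\alpha = 2,\ -1 \le \beta \le 1)$

Earlier it was shown that

$$P_e = P_{10} = P_{01} = \tfrac{1}{2} \operatorname{erfc}\left( \frac{\sqrt{N}\, s}{2c} \right).$$

This can be upper and lower bounded tightly by (see Ref. 20, eq.
7.1.13)

$$K_L\, e^{-N(s/2c)^2} < P_e \le K_U\, e^{-N(s/2c)^2},$$

where

$$K_L = \frac{1}{\sqrt{\pi}}\, \frac{1}{x + \sqrt{x^2 + 2}}, \qquad K_U = \frac{1}{\sqrt{\pi}}\, \frac{1}{x + \sqrt{x^2 + 4/\pi}}, \qquad x = \frac{\sqrt{N}\, s}{2c}.$$

Since both $K_L$ and $K_U$ behave as $O(N^{-1/2})$, $P_e \sim \exp[-N(s/2c)^2 - O(\ln N)]$, where
$K_U$ and $K_L$ introduce factors of $\ln(N)$ in the exponent. The Hellinger
integral bounds are (Ref. 27)

$$\tfrac{1}{4}\, e^{-2N(s/2c)^2} \le P_e \le \tfrac{1}{2}\, e^{-N(s/2c)^2}.$$

By inspection, the exponent in the upper bound agrees with the tight
lower and upper bound exponent [to within a factor of $\ln(N)$]. The
Chernoff upper bounds (Ref. 26) on $P_{10}$, $P_{01}$ are

$$P_{01} \le \exp[-N q^2 (s/c)^2] \quad \text{or} \quad P_{10} \le \exp[-N(1 - q)^2 (s/c)^2] \qquad \text{for some } q \in [0, 1],$$

and for $q = \tfrac{1}{2}$ these exponents agree with the tight upper and lower
bound exponents to within a factor of $\ln(N)$. The lower bounds (Ref. 25) are

$$P_{01} > \tfrac{1}{4} \exp[-N q^2 (s/c)^2 - q(s/c)\sqrt{2N}] \quad \text{or} \quad P_{10} > \tfrac{1}{4} \exp[-N(1 - q)^2 (s/c)^2 - (1 - q)(s/c)\sqrt{2N}] \qquad \text{for some } q \in [0, 1],$$

and for $N$ sufficiently large, the upper and lower bound exponents
are identical within a factor of $O(N^{-1/2})$.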
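
The gaussian comparison can be tabulated directly. In the sketch below the exact error is $\tfrac{1}{2}\operatorname{erfc}(\sqrt{N}s/2c)$; the "tight" pair uses the standard error-function inequality $1/(x + \sqrt{x^2 + 2}) < e^{x^2}\int_x^{\infty} e^{-t^2}dt \le 1/(x + \sqrt{x^2 + 4/\pi})$, and the Hellinger pair assumes $H = \exp[-N(s/2c)^2]$ with bounds $\tfrac{1}{4}H^2$ and $\tfrac{1}{2}H$ for equal priors. The function names are hypothetical.

```python
import math


def exact_error(n_samples, s, c):
    """Exact gaussian error probability for equal priors and s1 = -s0 = s."""
    return 0.5 * math.erfc(math.sqrt(n_samples) * s / (2.0 * c))


def tight_bounds(n_samples, s, c):
    """Lower and upper bounds from the error-function inequality."""
    x = math.sqrt(n_samples) * s / (2.0 * c)
    decay = math.exp(-x * x) / math.sqrt(math.pi)
    lower = decay / (x + math.sqrt(x * x + 2.0))
    upper = decay / (x + math.sqrt(x * x + 4.0 / math.pi))
    return lower, upper


def hellinger_bounds(n_samples, s, c):
    """Lower bound H^2/4 and upper bound H/2, with H = exp(-N (s/2c)^2)."""
    h = math.exp(-n_samples * (s / (2.0 * c)) ** 2)
    return 0.25 * h * h, 0.5 * h
```

For $N = 10$ and $s/c = 1$ the exact error is about $1.3 \times 10^{-2}$, the tight bounds bracket it to within a few percent, and the Hellinger bounds are $1.7 \times 10^{-3}$ and $4.1 \times 10^{-2}$, showing the different exponents of the two Hellinger bounds.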

Cauchy $(\alpha = 1,\ \beta = 0)$

The real and imaginary parts of the characteristic function of the
Cauchy log likelihood were calculated numerically at 513 evenly
spaced frequencies starting at $v = 0$ from a direct numerical quadrature
of the (complex) integral

$$\varphi(v) = \int_{-\infty}^{\infty} \exp\left( iv \ln \frac{(x + s)^2 + c^2}{(x - s)^2 + c^2} \right) \left( \frac{c}{\pi} \right) \frac{dx}{(x - s)^2 + c^2}, \qquad v = k\,\Delta v, \quad k = 0, \ldots, 512,$$

using an adaptive step-size Romberg numerical integration algorithm,
with an estimated error of $10^{-10}$ (all arithmetic was performed in
double precision). One representative characteristic function is plotted
in Fig. 11. The stationary-phase asymptotic expression was used for
frequencies outside of this range. The resulting approximation to the
characteristic function was raised to the $N$th power, and a numerical
approximation of the inverse transform of this resulting characteristic
function was calculated, using a fixed step-size Romberg algorithm
for the first 513 frequencies; an adaptive step-size Romberg algorithm
was used for the tail of the inverse transform. The final results are felt
to be accurate to three significant figures. The results are plotted in
Fig. 12, along with the Hellinger integral upper bound. Clearly, the
Hellinger integral upper bound is quite conservative; it is
straightforward to check that the Hellinger integral (squared) lower bound
is too optimistic, from the curves in Fig. 12.

4.7 Generalizations 

The extensions of the results in this section (as well as the following
section) to a much wider class of infinitely divisible distributions are
immediate. Here these extensions are sketched. Elementary arguments
(Ref. 15, page 540) show that if the Lévy measure $\nu$ of an infinitely
divisible distribution behaves asymptotically as a power, i.e.,
$\nu(X, \infty) \sim O(X^{-p})$, $\nu(-\infty, -X) \sim O(X^{-q})$, then
$\Pr[x > X] \sim O(X^{-p})$, $\Pr[x < -X] \sim O(X^{-q})$, where $p, q > 0$.
Given a sequence of i.i.d. random variables drawn from such a distribution
with one of two location parameters, it is straightforward to check that
results analogous to those in this section hold: (i) $l(r_i) \sim O(r_i^{-1})$,
(ii) the probability of an error of the first and second kind, using a log
likelihood ratio test, is upper bounded by $\exp(-AN)$, (iii) using a simple
linear test to discriminate between hypotheses, i.e., adding up the
observations and comparing the sum with a threshold, results in the
probability of an error of the first or second kind behaving as
$O(N L'^{-p})$, $O(N L'^{-q})$, and choosing $L'$ directly proportional to $N$
(as in the gaussian case) gives $P_{01}, P_{10} \sim O(N^{1-p}), O(N^{1-q})$,
which is much worse than the performance of the log likelihood test in this
asymptotic sense.

Fig. 11. Cauchy log likelihood characteristic function.

PROBABILITY MEASURES - I 1165

Fig. 12. Log likelihood probability of error and Hellinger integral upper bound
for Cauchy ($\alpha = 1$, $\beta = 0$) samples vs $(s/c)$.
[Curves shown: $\tfrac{1}{2}H$ (upper bound) and $P_E$ (optimum processing), with
$H = [\frac{c}{\pi} \int_{-\infty}^{\infty} dx / \sqrt{[(x - s)^2 + c^2][(x + s)^2 + c^2]}]^N$.]

V. DISCRETE TIME DETECTION OF STABLE MEASURES WITH 
DIFFERENT SCALES 

In this section, one approach is studied for hypothesis testing of
different scale parameters; since the ideas are quite similar to those just
developed, the treatment is much shorter.

One of two sequences of i.i.d. stable random variables is observed
(under one of two hypotheses, $H_0$ and $H_1$):

$$H_1:\ r_k = s^1 n_k$$
$$H_0:\ r_k = s^0 n_k, \qquad k = 1, \ldots, N.$$

The observed or received sequence is denoted $\{r_k\}_1^N$, where the $\{n_k\}_1^N$
are i.i.d. stable random variables with known parameters
$(\alpha, \beta, \gamma = 1, \delta = 0)$; both $s^1$ and $s^0$ are known. The a priori
probability of $H_j$ is $\pi_j$ $(j = 0, 1)$. The measures induced by
$\{r_k\}_1^N$ under $H_0$ and $H_1$ are equivalent for
$(0 < \alpha \le 2,\ -1 \le \beta \le 1)$; it remains to find the
optimum decision rule, the log likelihood ratio, and characterize its
performance.

5.1 Likelihood ratio test 

Before discussing the general case, the three special analytically 
tractable cases are treated. 

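Gaussian ($\alpha = 2$, $-1 \le \beta \le 1$):

$$p_n(x) = \frac{1}{\sqrt{4\pi}}\, e^{-x^2/4}, \qquad -\infty < x < \infty;$$

$$\therefore\ \Lambda' = \sum_{i=1}^{N} l(r_i) = N \ln\left(\frac{s^0}{s^1}\right) - \left[\left(\frac{1}{s^1}\right)^2 - \left(\frac{1}{s^0}\right)^2\right] \sum_{i=1}^{N} \frac{r_i^2}{4} \ \underset{H_0}{\overset{H_1}{\gtrless}}\ L'.$$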

The test involves squaring the observations and comparing with a
threshold; this test is the well-known chi-squared test (see Ref. 5,
pages 163-173).

Cauchy ($\alpha = 1$, $\beta = 0$):

$$p_n(x) = \frac{1}{\pi}\,(x^2 + 1)^{-1}, \qquad -\infty < x < \infty;$$

$$\therefore\ l(r_i) = \ln\frac{p_n(r_i/s^1)/s^1}{p_n(r_i/s^0)/s^0} = \ln\left(\frac{s^1}{s^0}\right) + \ln\frac{r_i^2 + (s^0)^2}{r_i^2 + (s^1)^2}.$$

For $|r_i| \ll s^0, s^1$, Taylor series arguments show $l(r_i)$ behaves as $r_i^2$, just
as in the gaussian case. However, unlike the gaussian case, where $l(r_i)$
behaves asymptotically ($|r_i| \gg s^1, s^0$) as $O(r_i^2)$, here
$l(r_i) \sim \ln(s^1/s^0) + O(r_i^{-2})$; again, large excursions are soft limited,
or essentially discarded.
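The contrast between the two cases can be seen numerically. The sketch below evaluates the per-sample log likelihood ratio for the gaussian and Cauchy densities given above; the function names and the values $s^0 = 1$, $s^1 = 2$ are assumptions chosen for illustration.

```python
import math

def llr_gaussian(r, s0, s1):
    """Per-sample l(r) for p_n(x) = exp(-x^2/4) / sqrt(4 pi)."""
    return math.log(s0 / s1) - 0.25 * r * r * (1.0 / s1 ** 2 - 1.0 / s0 ** 2)

def llr_cauchy(r, s0, s1):
    """Per-sample l(r) for p_n(x) = 1 / (pi (x^2 + 1))."""
    return math.log(s1 / s0) + math.log((r * r + s0 * s0) / (r * r + s1 * s1))

s0, s1 = 1.0, 2.0   # hypothetical scales
for r in (0.0, 1.0, 10.0, 1000.0):
    # The gaussian term keeps growing like r^2; the Cauchy term levels off.
    print(r, llr_gaussian(r, s0, s1), llr_cauchy(r, s0, s1))
```

For large $|r|$ the Cauchy term saturates at $\ln(s^1/s^0)$, which is the soft limiting described in the text, while the gaussian term grows without bound.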

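Pearson V ($\alpha = \tfrac{1}{2}$, $\beta = -1$):

$$p_n(x) = \begin{cases} \dfrac{1}{\sqrt{2\pi}}\, x^{-3/2}\, e^{-1/(2x)}, & x > 0, \\ 0, & x < 0; \end{cases}$$

$$\therefore\ l(r_i) = \ln\frac{p_n(r_i/s^1)/s^1}{p_n(r_i/s^0)/s^0} = \frac{1}{2}\ln\left(\frac{s^1}{s^0}\right) - \frac{s^1 - s^0}{2 r_i}, \qquad r_i > 0.$$

Again, large deviations in $r_i$ are soft limited or weighted lightly, since
asymptotically ($r_i \gg s^1, s^0$) $l(r_i)$ behaves as $O(r_i^{-1})$.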
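As a check on the Pearson V case, the sketch below compares the log of the density ratio computed directly from $p_n$ with the closed form $\tfrac12 \ln(s^1/s^0) - (s^1 - s^0)/(2 r_i)$. The function names and test values are assumptions for illustration.

```python
import math

def levy_pdf(x):
    """Pearson V density (alpha = 1/2, one-sided); zero for x <= 0."""
    if x <= 0.0:
        return 0.0
    return x ** -1.5 * math.exp(-0.5 / x) / math.sqrt(2.0 * math.pi)

def llr_direct(r, s0, s1):
    """ln of [p_n(r/s1)/s1] / [p_n(r/s0)/s0], valid for r > 0."""
    return math.log((levy_pdf(r / s1) / s1) / (levy_pdf(r / s0) / s0))

def llr_closed_form(r, s0, s1):
    """(1/2) ln(s1/s0) - (s1 - s0) / (2 r), valid for r > 0."""
    return 0.5 * math.log(s1 / s0) - (s1 - s0) / (2.0 * r)

for r in (0.1, 1.0, 50.0):
    print(r, llr_direct(r, 1.0, 2.0), llr_closed_form(r, 1.0, 2.0))
```

The second term vanishes as $r \to \infty$, so a very large observation contributes only the constant $\tfrac12 \ln(s^1/s^0)$.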

The remaining cases can be treated in identical manner using the
power series and asymptotic series expansions for the stable probability
density function. For ($0 < \alpha < 2$, $-1 < \beta < 1$; $\alpha \ne 1$), the important
points are: (i) for $|r_i| \ll s^0, s^1$, the $i$th term ($\beta \ne 0$) in the log
likelihood behaves as $r_i$, unlike in the gaussian case, while for $\beta = 0$,
$l(r_i) \sim r_i^2$, (ii) for $|r_i| \gg s^0, s^1$, soft limiting of large deviations is used,
and the log likelihood's $i$th term behaves as $\alpha \ln(s^1/s^0) + O(|r_i|^{-\alpha})$.
Figures 13 and 14 show representative log likelihood ratios for fixed $\alpha$
and varying $\beta$, and fixed $\beta$ with $\alpha$ varying, respectively, computed from
power series and asymptotic series (Ref. 31).

The final case ($0 < \alpha < 2$, $\beta = 1$, or $\beta = -1$) must be handled
with a little more care. Only the case $\beta = -1$ is discussed, since the
other follows immediately. For ($1 < \alpha < 2$), the first point made above
is still valid, while the second point is valid only for $r_i > 0$, $r_i \gg s^0, s^1$.
For $r_i < 0$, $|r_i| \gg s^0, s^1$, $l(r_i)$ behaves as
$\alpha \ln(s^1/s^0) + O(-|r_i|^{\alpha/(\alpha-1)})$, i.e., decreasing with $|r_i|$.
For ($0 < \alpha < 1$), for $r_i > 0$, $r_i \ll s^0, s^1$, the $i$th term in the log
likelihood behaves as $O(r_i^{-\alpha/(1-\alpha)})$, in agreement with the $O(r_i^{-1})$
behavior of the Pearson V case ($\alpha = \tfrac12$). Finally, for $\alpha = 1$,
$l(r_i) = O\{-\exp[(\pi/2)|r_i|]\}$ as $r_i \to -\infty$.

5.2 Performance limitations 

The general problem of finding $P_E$, $P_{01}$, and $P_{10}$ for arbitrary stable
distributions is still open, both analytically and numerically (because
of expense). The three special analytic cases are treated here, to point
out the problems that must be overcome in the general case, if one
attempts to find the log likelihood probability density by transform
methods.

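Gaussian ($\alpha = 2$, $-1 \le \beta \le 1$): assuming hypothesis $H_j$ ($j = 0, 1$)
true, the Fourier transform of the log likelihood probability density is

$$E(e^{iv\Lambda'} \mid H_1) = \left(\frac{s^0}{s^1}\right)^{ivN} \left\{1 - iv\left[\left(\frac{s^1}{s^0}\right)^2 - 1\right]\right\}^{-N/2},$$

$$E(e^{iv\Lambda'} \mid H_0) = \left(\frac{s^0}{s^1}\right)^{ivN} \left\{1 + iv\left[\left(\frac{s^0}{s^1}\right)^2 - 1\right]\right\}^{-N/2}.$$

These Fourier transforms can be inverted. With $x = \Lambda' - N \ln(s^0/s^1) > 0$
(which requires $s^1 > s^0$),

$$p(\Lambda' \mid H_1) = \left(\frac{(s^0)^2}{(s^1)^2 - (s^0)^2}\right)^{N/2} x^{(N/2)-1} \exp\left(-\frac{(s^0)^2}{(s^1)^2 - (s^0)^2}\, x\right) \Big/ \Gamma(N/2),$$

$$p(\Lambda' \mid H_0) = \left(\frac{(s^1)^2}{(s^1)^2 - (s^0)^2}\right)^{N/2} x^{(N/2)-1} \exp\left(-\frac{(s^1)^2}{(s^1)^2 - (s^0)^2}\, x\right) \Big/ \Gamma(N/2),$$

$$p(\Lambda' \mid H_1) = p(\Lambda' \mid H_0) = 0, \qquad x = \Lambda' - N \ln\left(\frac{s^0}{s^1}\right) < 0.$$

Finally, the probabilities of errors of the first and second kind are, for
$L' > N \ln(s^0/s^1)$,

$$P_{01} = \frac{2}{N} \left(\frac{(s^0)^2\,[L' - N\ln(s^0/s^1)]}{(s^1)^2 - (s^0)^2}\right)^{N/2} {}_1F_1\!\left(\frac{N}{2};\ 1 + \frac{N}{2};\ \frac{(s^0)^2\,[L' - N\ln(s^0/s^1)]}{(s^0)^2 - (s^1)^2}\right) \Big/ \Gamma(N/2),$$

$$P_{10} = 1 - \frac{2}{N} \left(\frac{(s^1)^2\,[L' - N\ln(s^0/s^1)]}{(s^1)^2 - (s^0)^2}\right)^{N/2} {}_1F_1\!\left(\frac{N}{2};\ 1 + \frac{N}{2};\ \frac{(s^1)^2\,[L' - N\ln(s^0/s^1)]}{(s^0)^2 - (s^1)^2}\right) \Big/ \Gamma(N/2),$$

$$P_{10} = 1, \quad P_{01} = 0, \qquad L' < N \ln(s^0/s^1).$$

Fig. 13 - Representative log likelihood functions ($\alpha$ fixed, $\beta$ varying).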

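Cauchy ($\alpha = 1$, $\beta = 0$): the log likelihood characteristic function under
$H_1$ is

$$E(e^{iv\Lambda'} \mid H_1) = \left\{ \frac{s^1}{\pi} \left(\frac{s^1}{s^0}\right)^{iv} \int_{-\infty}^{\infty} [r^2 + (s^0)^2]^{iv}\, [r^2 + (s^1)^2]^{-iv-1}\, dr \right\}^N$$

$$= \left\{ \left(\frac{s^1}{s^0}\right)^{iv+1} {}_2F_1\!\left[iv + 1,\ \tfrac12;\ 1;\ 1 - \left(\frac{s^1}{s^0}\right)^2\right] \right\}^N,$$

where $s^1 > s^0$ was assumed. Stationary phase arguments show that the
characteristic function decays asymptotically as $O(|v|^{-N/2})$. Again, the
Fourier inversion lemma guarantees that the problem of finding $P_{01}$
is solved. A similar analysis holds assuming $H_0$ is true.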
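The single-sample Cauchy characteristic function can be checked numerically. The sketch below computes $E(e^{iv\,l(r)} \mid H_1)$ by direct quadrature and compares it with a Gauss hypergeometric series. The series form used here, $(s^1/s^0)^{iv}\, {}_2F_1(-iv, \tfrac12; 1; 1 - (s^0/s^1)^2)$, follows from the substitution $r = s^1 \tan\theta$; it is an assumption derived for this illustration, chosen because its argument lies inside the unit disk, and is not necessarily the form printed in the source. All function names are hypothetical.

```python
import cmath
import math

def cf_quadrature(v, s0, s1, n=4000):
    """E[exp(i v l(r)) | H1] for one Cauchy sample, by the midpoint rule.

    Uses r = s1*tan(theta), under which the H1 density is uniform in theta.
    """
    total = 0j
    for k in range(n):
        theta = -0.5 * math.pi + (k + 0.5) * math.pi / n
        r = s1 * math.tan(theta)
        l = math.log(s1 / s0) + math.log((r * r + s0 * s0) / (r * r + s1 * s1))
        total += cmath.exp(1j * v * l)
    return total / n

def hyp2f1_series(a, b, c, z, terms=400):
    """Gauss series for 2F1(a, b; c; z), |z| < 1; a may be complex."""
    term, total = 1.0 + 0j, 1.0 + 0j
    for n in range(terms):
        term *= (a + n) * (b + n) / ((c + n) * (n + 1)) * z
        total += term
    return total

def cf_series(v, s0, s1):
    """Series form of the same expectation, assuming s1 > s0."""
    k = 1.0 - (s0 / s1) ** 2
    return (s1 / s0) ** (1j * v) * hyp2f1_series(-1j * v, 0.5, 1.0, k)

print(cf_quadrature(0.7, 1.0, 2.0), cf_series(0.7, 1.0, 2.0))
```

Raising either value to the $N$th power gives the characteristic function of $\Lambda'$ for $N$ independent observations.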

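An alternate approach is to compute the Mellin transform of the
likelihood probability density function; with transform variable $s$
(no superscript), the results are

$$E(\Lambda^{s-1} \mid H_0) = \left\{ \left(\frac{s^1}{s^0}\right)^{s-1} {}_2F_1\!\left[s - 1,\ \tfrac12;\ 1;\ 1 - \left(\frac{s^1}{s^0}\right)^2\right] \right\}^N.$$

Unfortunately, it is not clear how to invert this transform to find $P_{01}$
and $P_{10}$.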



[Figure 14. Recoverable labels: characteristic index $\alpha$ (curves marked
$\alpha = 1.5$ and $\alpha = 1.9$); skewness parameter $\beta$ (0.0 and 0.5);
$H_1$: $r = s^1 n$, $H_0$: $r = s^0 n$, $s^1 = 2$, $s^0 = 1$; horizontal axis $r$.]

Fig. 14 - Representative log likelihood functions ($\alpha$ varying, $\beta$ fixed).

A third approach is to convolve the probability density of the log
likelihood with itself $N$ times; for $N = 2$, the convolution involves
elliptic integrals; successive convolutions are quite formidable. This was
not investigated further.

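Pearson V ($\alpha = \tfrac12$, $\beta = -1$): the log likelihood characteristic function
is (assuming now $s^0 > s^1$)

$$E(e^{iv\Lambda'} \mid H_1) = \left\{ \left(\frac{s^1}{s^0}\right)^{iv} \Big/ \left[1 - iv\left(\frac{s^0}{s^1} - 1\right)\right] \right\}^{N/2},$$

$$E(e^{iv\Lambda'} \mid H_0) = \left\{ \left(\frac{s^1}{s^0}\right)^{iv} \Big/ \left[1 - iv\left(1 - \frac{s^1}{s^0}\right)\right] \right\}^{N/2}.$$

The log likelihood probability density is, with
$x = \Lambda' - \tfrac{N}{2} \ln(s^1/s^0) > 0$,

$$p(\Lambda' \mid H_1) = \left(\frac{s^1}{s^0 - s^1}\right)^{N/2} x^{(N/2)-1}\, e^{-[s^1/(s^0 - s^1)]\,x} \Big/ \Gamma(N/2),$$

$$p(\Lambda' \mid H_0) = \left(\frac{s^0}{s^0 - s^1}\right)^{N/2} x^{(N/2)-1}\, e^{-[s^0/(s^0 - s^1)]\,x} \Big/ \Gamma(N/2),$$

$$p(\Lambda' \mid H_1) = p(\Lambda' \mid H_0) = 0, \qquad \Lambda' < \tfrac{N}{2} \ln(s^1/s^0).$$

The probabilities of errors of the first and second kind are, for
$L' > \tfrac{N}{2} \ln(s^1/s^0)$,

$$P_{10} = 1 - \frac{2}{N} \left(\frac{s^0\,[L' - \tfrac N2 \ln(s^1/s^0)]}{s^0 - s^1}\right)^{N/2} {}_1F_1\!\left(\frac{N}{2};\ \frac{N}{2} + 1;\ -\frac{s^0\,[L' - \tfrac N2 \ln(s^1/s^0)]}{s^0 - s^1}\right) \Big/ \Gamma\!\left(\frac{N}{2}\right),$$

$$P_{01} = \frac{2}{N} \left(\frac{s^1\,[L' - \tfrac N2 \ln(s^1/s^0)]}{s^0 - s^1}\right)^{N/2} {}_1F_1\!\left(\frac{N}{2};\ \frac{N}{2} + 1;\ -\frac{s^1\,[L' - \tfrac N2 \ln(s^1/s^0)]}{s^0 - s^1}\right) \Big/ \Gamma\!\left(\frac{N}{2}\right),$$

$$P_{10} = 1, \quad P_{01} = 0, \qquad L' < \tfrac{N}{2} \ln(s^1/s^0).$$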
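The Pearson V error probabilities can be checked by simulation. For this case $\Lambda' - \tfrac N2 \ln(s^1/s^0)$ is a scaled chi-squared variable with $N$ degrees of freedom, so both error probabilities reduce to a regularized incomplete gamma function (equivalent to the ${}_1F_1$ form). The sketch below assumes that $P_{10}$ denotes deciding $H_1$ when $H_0$ is true and $P_{01}$ the reverse, which is inferred from the limiting values $P_{10} = 1$, $P_{01} = 0$; the function names and parameter values are hypothetical.

```python
import math
import random

def gamma_cdf(a, z, terms=500):
    """Regularized lower incomplete gamma P(a, z) by its power series."""
    if z <= 0.0:
        return 0.0
    term = math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))
    total = term
    for n in range(1, terms):
        term *= z / (a + n)
        total += term
    return total

def pearson_v_errors(threshold, n_obs, s0, s1):
    """(P10, P01) for the log likelihood test, assuming s0 > s1."""
    t = threshold - 0.5 * n_obs * math.log(s1 / s0)
    if t <= 0.0:
        return 1.0, 0.0
    p10 = 1.0 - gamma_cdf(0.5 * n_obs, t * s0 / (s0 - s1))
    p01 = gamma_cdf(0.5 * n_obs, t * s1 / (s0 - s1))
    return p10, p01

def simulate(threshold, n_obs, s0, s1, trials, rng):
    """Monte Carlo (P10, P01); a Pearson V variate is 1/Z**2, Z standard normal."""
    def log_likelihood(scale):
        total = 0.0
        for _ in range(n_obs):
            r = scale / rng.gauss(0.0, 1.0) ** 2
            total += 0.5 * math.log(s1 / s0) - (s1 - s0) / (2.0 * r)
        return total
    false_h1 = sum(log_likelihood(s0) > threshold for _ in range(trials))
    false_h0 = sum(log_likelihood(s1) <= threshold for _ in range(trials))
    return false_h1 / trials, false_h0 / trials

print(pearson_v_errors(0.0, 10, 2.0, 1.0))
print(simulate(0.0, 10, 2.0, 1.0, 5000, random.Random(0)))
```

The two printed pairs should agree to within Monte Carlo error; the closed form is far cheaper than the simulation, which is the point of having an analytically tractable case.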

Again, the general problem is still open analytically, because closed-form
expressions for the stable probability density function are unknown
at present (except for the three cases covered here). The general
problem is expensive to tackle numerically at present, because of the
expense of both calculating and storing the characteristic function of
the log likelihood probability density, and because of the expense of
repeating these calculations for many different parameter choices.

5.3 Analytic performance bounds

Apparently only in the three special cases does the Kakutani inner 
product reduce to simple expressions. These results are recorded here, 
while Section 5.4 discusses numerical approximations of these integrals 
for various cases of interest. 

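Gaussian ($\alpha = 2$, $-1 \le \beta \le 1$):

$$p_n(x) = \frac{1}{\sqrt{4\pi}}\, e^{-x^2/4}, \qquad -\infty < x < \infty;$$

$$\therefore\ H_q(P_0, P_1) = \left[ \frac{(s^0)^{2q}\,(s^1)^{2(1-q)}}{q\,(s^0)^2 + (1-q)\,(s^1)^2} \right]^{1/2}.$$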

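Cauchy ($\alpha = 1$, $\beta = 0$):

$$p_n(x) = \frac{1}{\pi}\,(x^2 + 1)^{-1}, \qquad -\infty < x < \infty;$$

$$\therefore\ H_q(P_0, P_1) = \int_{-\infty}^{\infty} \left[\frac{1}{s^1}\, p_n\!\left(\frac{x}{s^1}\right)\right]^{q} \left[\frac{1}{s^0}\, p_n\!\left(\frac{x}{s^0}\right)\right]^{1-q} dx.$$

The Hellinger integral can be evaluated from the tables in Ref. 30,
263.00.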

Pearson V ($\alpha = \tfrac12$, $\beta = -1$):

$$p_n(x) = \begin{cases} \dfrac{1}{\sqrt{2\pi}}\, x^{-3/2}\, e^{-1/(2x)}, & x > 0, \\ 0, & x < 0; \end{cases}$$

$$\therefore\ H_q(P_0, P_1) = \left[ \frac{(s^1)^{q}\,(s^0)^{1-q}}{q\,s^1 + (1-q)\,s^0} \right]^{1/2}.$$

5.4 Numerical approximation of performance bounds

The methods and checks employed were identical with those used
in the detection of location for accurately calculating the inner product
of the two stable probability measures.

Figure 15 shows $\mu(q)$ vs $q$ for fixed $(s^1/s^0)$ and [$\alpha = 1.1\,(0.4)\,1.9$,
$\beta = 0$]. This raises the conjecture that the closer the characteristic
index is to 2, the smaller the Kakutani inner product.

Figure 16 shows $\mu(q)$ vs $q$ for fixed $(\alpha, \beta)$ and various values of
$(s^1/s^0)$: the smaller the $(s^1/s^0)$, the smaller the $H_q(P_0, P_1)$.

Figure 17 shows $H_{1/2}(P_0, P_1)$ for various $(\alpha, \beta)$ as a function of
$(s^1/s^0)$; note that the case $\alpha = 2$ does not appear to be singular here.
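For the analytically tractable cases, the inner product can be evaluated directly. The sketch below integrates $p_1^q\, p_0^{1-q}$ numerically for Cauchy scales, compares the $q = \tfrac12$ value with a closed form based on Gauss's arithmetic-geometric mean evaluation of the complete elliptic integral, and evaluates the Pearson V expression. The convention $H_q = \int p_1^q\, p_0^{1-q}\, dx$, the AGM form, and all function names are assumptions made for illustration, not the paper's numerical method.

```python
import math

def cauchy_inner_product(q, s0, s1, n=4000):
    """H_q = integral of p1(x)**q * p0(x)**(1-q) for Cauchy scales s0, s1.

    Midpoint rule after the substitution x = s0*tan(theta).
    """
    total = 0.0
    for k in range(n):
        theta = -0.5 * math.pi + (k + 0.5) * math.pi / n
        x = s0 * math.tan(theta)
        p0 = s0 / (math.pi * (x * x + s0 * s0))
        p1 = s1 / (math.pi * (x * x + s1 * s1))
        jacobian = s0 / math.cos(theta) ** 2
        total += p1 ** q * p0 ** (1.0 - q) * jacobian
    return total * math.pi / n

def agm(a, b):
    """Arithmetic-geometric mean of two positive numbers."""
    while abs(a - b) > 1e-15 * a:
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return a

def cauchy_hellinger(s0, s1):
    """q = 1/2 case: sqrt(s0*s1) / AGM(s0, s1)."""
    return math.sqrt(s0 * s1) / agm(s0, s1)

def pearson_v_inner_product(q, s0, s1):
    """Closed form for the Pearson V (alpha = 1/2) case."""
    return math.sqrt(s1 ** q * s0 ** (1.0 - q) / (q * s1 + (1.0 - q) * s0))

for ratio in (1.0, 4.0, 16.0):
    print(ratio, cauchy_hellinger(1.0, ratio), pearson_v_inner_product(0.5, 1.0, ratio))
```

Both inner products equal 1 when the scales coincide and decrease as the scales separate, which is what makes $H_q^N$ useful as an error bound.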

5.5 Comparison of performance of log likelihood decision rule
with a chi-squared test

How does the performance of the log likelihood test compare with
that of a chi-squared test, in particular for characteristic index $\alpha$ near 2?
The chi-squared test involves

$$\sum_{i=1}^{N} r_i^2 \ \underset{H_0}{\overset{H_1}{\gtrless}}\ L'.$$

The distribution of any one of the $r_i^2$ can be found from the series
described earlier:

$$p(r_i^2 \mid H_j) = \begin{cases} \dfrac{p_n\!\left(x = \sqrt{r_i^2}/s^j\right) + p_n\!\left(x = -\sqrt{r_i^2}/s^j\right)}{2 s^j \sqrt{r_i^2}}, & r_i^2 > 0, \\ 0, & r_i^2 < 0, \end{cases}$$

where $p_n(x)$ is the stable density with parameters $(\alpha, \beta, \gamma = 1, \delta = 0)$.

The discussion now follows from that in Section 4.6, but is not as
detailed. Using elementary arguments (Ref. 15, pages 268-272), it can
be shown that if $0 < \alpha < 2$, $-1 < \beta < 1$, then the probability of error
behaves as $O(N (L')^{-\alpha/2})$. If $L'$ is set at a threshold which is a
fraction of $N$, then

$$P_E \sim O(N^{1-(\alpha/2)});$$

i.e., the probability of error grows with $N$, the number of observations.
For comparison, the upper bounds on $P_{01}$, $P_{10}$, and $P_E$ for log likelihood
detection all behave as $O(e^{-AN})$, where $A$ depends on ($\alpha$, $\beta$, $s^1$, and $s^0$).
Thus, the log likelihood test is asymptotically far superior to the
chi-squared test by the above argument.

Fig. 15 - Logarithm of Kakutani inner product $H_q$ vs $q$ [$\alpha = 1.1(0.4)1.9$, $\beta = 0$],
$(s^1/s^0) = 16$.
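The asymptotic comparison can be illustrated by a small Monte Carlo experiment for Cauchy noise ($\alpha = 1$). The sketch below estimates, under $H_0$, how often a sum-of-squares test with threshold proportional to $N$ and a log likelihood test with threshold 0 wrongly decide $H_1$. The scales, the proportionality constant `c`, the zero threshold, and the function names are all assumptions chosen for illustration.

```python
import math
import random

def cauchy_sample(scale, rng):
    """Cauchy variate with the given scale, by inverting the distribution function."""
    return scale * math.tan(math.pi * (rng.random() - 0.5))

def error_rates(n_obs, s0, s1, c, trials, rng):
    """Monte Carlo false-alarm rates under H0 for Cauchy noise.

    Returns (rate for the sum-of-squares test with threshold c*n_obs,
             rate for the log likelihood test with threshold 0).
    """
    fa_square = fa_llr = 0
    for _ in range(trials):
        r = [cauchy_sample(s0, rng) for _ in range(n_obs)]
        if sum(x * x for x in r) > c * n_obs:
            fa_square += 1
        llr = sum(math.log(s1 / s0)
                  + math.log((x * x + s0 * s0) / (x * x + s1 * s1)) for x in r)
        if llr > 0.0:
            fa_llr += 1
    return fa_square / trials, fa_llr / trials

rng = random.Random(7)
for n_obs in (10, 100):
    print(n_obs, error_rates(n_obs, 1.0, 4.0, 100.0, 300, rng))
```

In this experiment the false-alarm rate of the sum-of-squares test rises with $N$, because the sum is dominated by its largest terms, while that of the log likelihood test falls toward zero.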

VI. DISTINGUISHING STABLE PROBABILITY MEASURES WITH DIFFERENT
CHARACTERISTIC INDICES AND SKEWNESS PARAMETERS

For completeness, this section touches on the form the log likelihood
test takes for discriminating between stable distributions with different
characteristic indices and with different skewness parameters.
Performance of this test will not be covered here; much of the earlier
discussion on performance is applicable here. A table in the Appendix
summarizes the behavior of $l(r_i)$ both asymptotically and for
$|r_i| \ll 1$, and includes both the results in Sections 5.4 and 5.5 as
well as the results of this section.

Fig. 16 - Logarithm of Kakutani inner product $H_q$ vs $q$ [($\alpha = 1.90$, $\beta = 0$),
$(s^0/s^1) = 1, 4, 8, 16$].

Fig. 17 - Hellinger integral vs $(s^1/s^0)$ [$\alpha = 0.7(0.2)1.90$, $\beta = 0$].

One of two sequences of i.i.d. stable random variables with known
parameters is observed. In Section 6.1, the parameters are
$(\alpha^j, \beta, \gamma = 1, \delta = 0)$, where $0 < \alpha^0 < \alpha^1 \le 2$; in Section 6.2,
the parameters are $(\alpha, \beta^j, \gamma = 1, \delta = 0)$, where
$-1 \le \beta^0 < \beta^1 \le 1$ (recall $j = 0, 1$). The special case
($\alpha = 1$, $|\beta| = 1$) is covered in the table in the Appendix
but not in the discussion here.

6.1 Distinguishing different characteristic indices 

For $-1 < \beta < 1$, the measures $P_0$ and $P_1$ are equivalent, so the log
likelihood ratio is always finite. The log likelihood test is

$$\Lambda' = \sum_{i=1}^{N} l(r_i) \ \underset{H_0}{\overset{H_1}{\gtrless}}\ L',$$

where

$$l(r_i) = \ln \frac{p_n(r_i;\ \alpha^1, \beta, \gamma = 1, \delta = 0)}{p_n(r_i;\ \alpha^0, \beta, \gamma = 1, \delta = 0)}.$$

Two cases arise: symmetric ($\beta = 0$) and asymmetric ($\beta \ne 0$,
$-1 < \beta < 1$) stable distributions. For the symmetric case, the distributions
are symmetric about their unique mode, and thus $l(r_i) \sim r_i^2$ for
$|r_i| \ll 1$. For the asymmetric case, the modes no longer coincide, and
$l(r_i) \sim r_i$ for $|r_i| \ll 1$. Recall that for $1 < \alpha < 2$, for fixed skewness
$\beta$ ($\beta < 0$) the mode decreases as $\alpha$ decreases; for $0 < \alpha < 1$, the
opposite is true. Thus, $l(r_i)$ is the difference of two unimodal functions
and, in general, should have two points of zero slope. For $|r_i| \gg 1$,
$l(r_i) = O(-\ln |r_i|)$, so large deviations are weighted quite strongly.
Note the log likelihood distribution has its support on the whole line,
unlike the two previous sections, except for
($0 < \alpha^0 < 1 \le \alpha^1 \le 2$, $|\beta| = 1$).
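The only pair of symmetric stable laws with closed-form densities is Cauchy ($\alpha^0 = 1$) against gaussian ($\alpha^1 = 2$), so that pair is used in the sketch below to illustrate the quadratic behavior of $l(r_i)$ near the common mode. Because $\alpha^1 = 2$ here, the tail behaves as $O(-r_i^2)$ (the Appendix row for $\alpha_1 = 2$) rather than $O(-\ln|r_i|)$, which applies when $\alpha^1 < 2$. The function names are assumptions for illustration.

```python
import math

def log_gaussian_pdf(r):
    """ln of the alpha = 2 stable density exp(-r^2/4) / (2 sqrt(pi))."""
    return -0.25 * r * r - math.log(2.0 * math.sqrt(math.pi))

def log_cauchy_pdf(r):
    """ln of the alpha = 1, beta = 0 stable density 1 / (pi (1 + r^2))."""
    return -math.log(math.pi) - math.log1p(r * r)

def llr_index(r):
    """l(r) for alpha^1 = 2 against alpha^0 = 1 (beta = 0, gamma = 1, delta = 0)."""
    return log_gaussian_pdf(r) - log_cauchy_pdf(r)

for r in (0.0, 0.1, 1.0, 10.0, 100.0):
    print(r, llr_index(r))
```

Near the origin $l(r) - l(0) \approx \tfrac34 r^2$, and the two points of zero slope away from the origin are visible as the function rises before the gaussian tail pulls it down.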
For $\beta = -1$, and $1 < \alpha^0 < \alpha^1 < 2$, the measures $P_0$ and $P_1$ are
equivalent, and the above discussion follows immediately with one
exception: for $r_i \gg 1$, $l(r_i) = O(-\ln r_i)$, while for $|r_i| \gg 1$, $r_i < 0$,
$l(r_i) = O(|r_i|^{\alpha^0/(\alpha^0 - 1)})$.

For $\beta = -1$, $0 < \alpha^0 < \alpha^1 < 1$, the measures $P_0$ and $P_1$ are
equivalent. For $r_i > 0$, $|r_i| \ll 1$, $l(r_i) \sim r_i^{\alpha^0/(\alpha^0 - 1)}$, while for
$r_i \gg 1$, $l(r_i) = O(-\ln r_i)$.

Finally, for $\beta = -1$, $0 < \alpha^0 < 1 < \alpha^1 < 2$, the measures $P_0$ and
$P_1$ are neither equivalent nor mutually orthogonal. For $r_i \gg 1$,
$l(r_i) = O(-\ln r_i)$, while for $r_i < 0$, $l(r_i) = \infty$. For $r_i > 0$, $r_i \ll 1$,
$l(r_i) = O(r_i^{\alpha^0/(\alpha^0 - 1)})$.

6.2 Distinguishing different skewness parameters 

For $-1 < \beta^0 < \beta^1 < 1$, the measures $P_0$ and $P_1$ are equivalent, so
the log likelihood ratio is finite. The discussion follows that of Section
6.1 exactly, with the difference that if $r_i \gg 1$, $l(r_i) = \ln(R_1/R_0)
+ O(r_i^{-\alpha})$, while if $|r_i| \gg 1$, $r_i < 0$, $l(r_i) = \ln(L_1/L_0) + O(|r_i|^{-\alpha})$.*

For $-1 = \beta^0 < \beta^1 < 1$, $1 \le \alpha < 2$, the measures $P_0$ and $P_1$ are
equivalent. For $|r_i| \gg 1$, $r_i < 0$, $l(r_i) = O(|r_i|^{\alpha/(\alpha - 1)})$, while for
$r_i \gg 1$, $l(r_i) = \ln(R_1/R_0) + O(r_i^{-\alpha})$.

For $-1 = \beta^0 < \beta^1 < 1$, $0 < \alpha < 1$, the measures $P_0$ and $P_1$ are
neither equivalent nor mutually orthogonal. For $r_i \gg 1$, $l(r_i)
= \ln(R_1/R_0) + O(r_i^{-\alpha})$, while for $0 < r_i \ll 1$,
$l(r_i) = O(r_i^{\alpha/(\alpha - 1)})$. For $0 > r_i$, $l(r_i) = \infty$.

* See the Appendix for definition of constants $R_j$, $L_j$ ($j = 0, 1$).

APPENDIX

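Asymptotic Behavior of Log Likelihood Ratio

A.1 Location (s)

$$l(x) = \ln \frac{p(x - s_1;\ \alpha, \beta, \gamma, \delta = 0)}{p(x - s_0;\ \alpha, \beta, \gamma, \delta = 0)}, \qquad s_0 < s_1$$

Case                                   $x \to +\infty$      $x \to -\infty$
$\alpha = 2$                           $O(x)$               $O(x)$
$0 < \alpha < 2$, $-1 < \beta < 1$     $O(x^{-1})$          $O(x^{-1})$
$1 < \alpha < 2$, $\beta = -1$         $O(x^{-1})$          $O(-|x|^{1/(\alpha-1)})$
$\alpha = 1$, $\beta = -1$             $O(x^{-1})$          $O(-e^{(\pi/2)|x - s_1|})$

Case                                   $x \to +\infty$      $x \downarrow s_1$
$0 < \alpha < 1$, $\beta = -1$         $O(x^{-1})$          $O(-(x - s_1)^{\alpha/(\alpha-1)})$

A.2 Scale (c)

$$l(x) = \ln \frac{p(x;\ \alpha, \beta, \gamma_1 = c_1^{\alpha}, \delta = 0)}{p(x;\ \alpha, \beta, \gamma_0 = c_0^{\alpha}, \delta = 0)}, \qquad c_0 < c_1$$

Case                                   $x \to +\infty$                          $x \to -\infty$
$\alpha = 2$                           $O(x^2)$                                 $O(x^2)$
$0 < \alpha < 2$, $-1 < \beta < 1$*    $\alpha \ln(c_1/c_0) + O(x^{-\alpha})$   $\alpha \ln(c_1/c_0) + O(|x|^{-\alpha})$
$1 < \alpha < 2$, $\beta = -1$         $\alpha \ln(c_1/c_0) + O(x^{-\alpha})$   $O(-|x|^{\alpha/(\alpha-1)})$
$\alpha = 1$, $\beta = -1$             $\alpha \ln(c_1/c_0) + O(x^{-\alpha})$   $O(-e^{(\pi/2)|x|})$

Case                                   $x \to +\infty$                          $x \downarrow 0$
$0 < \alpha < 1$, $\beta = -1$         $\alpha \ln(c_1/c_0) + O(x^{-\alpha})$   $O(-x^{\alpha/(\alpha-1)})$

* This excludes the Cauchy ($\alpha = 1$, $\beta = 0$), which was examined in the text as a
special case.

A.3 Characteristic index ($\alpha$)

$$l(x) = \ln \frac{p(x;\ \alpha_1, \beta, \gamma = 1, \delta = 0)}{p(x;\ \alpha_0, \beta, \gamma = 1, \delta = 0)}, \qquad 0 < \alpha_0 < \alpha_1 \le 2$$

Case                                                  $x \to +\infty$   $x \to -\infty$
$0 < \alpha_0 < \alpha_1 = 2$, $-1 < \beta < 1$       $O(-x^2)$         $O(-x^2)$
$0 < \alpha_0 < \alpha_1 < 2$, $-1 < \beta < 1$       $O(-\ln x)$       $O(-\ln |x|)$
$1 < \alpha_0 < \alpha_1 = 2$, $\beta = -1$           $O(-x^2)$         $O(|x|^{\alpha_0/(\alpha_0-1)})$
$1 < \alpha_0 < \alpha_1 < 2$, $\beta = -1$           $O(-\ln x)$       $O(|x|^{\alpha_0/(\alpha_0-1)})$
$1 = \alpha_0 < \alpha_1 = 2$, $\beta = -1$           $O(-x^2)$         $O(e^{(\pi/2)|x|})$
$1 = \alpha_0 < \alpha_1 < 2$, $\beta = -1$           $O(-\ln x)$       $O(e^{(\pi/2)|x|})$

Case                                                  $x \to +\infty$   $x \downarrow 0$
$0 < \alpha_0 < 1 < \alpha_1 < 2$, $\beta = -1$       $O(-\ln x)$       $O(x^{\alpha_0/(\alpha_0-1)})$
$0 < \alpha_0 < \alpha_1 = 1$, $\beta = -1$           $O(-\ln x)$       $O(x^{\alpha_0/(\alpha_0-1)})$
$0 < \alpha_0 < \alpha_1 < 1$, $\beta = -1$           $O(-\ln x)$       $O(x^{\alpha_0/(\alpha_0-1)})$

A.4 Skewness ($\beta$)

$$l(x) = \ln \frac{p(x;\ \alpha, \beta_1, \gamma = 1, \delta = 0)}{p(x;\ \alpha, \beta_0, \gamma = 1, \delta = 0)}, \qquad -1 \le \beta_0 < \beta_1 \le 1$$

Case                                                  $x \to +\infty$                    $x \to -\infty$
$0 < \alpha < 2$, $-1 < \beta_0 < \beta_1 < 1$        $\ln(R_1/R_0) + O(x^{-\alpha})$    $\ln(L_1/L_0) + O(|x|^{-\alpha})$
$1 < \alpha < 2$, $-1 = \beta_0 < \beta_1 < 1$        $\ln(R_1/R_0) + O(x^{-\alpha})$    $O(|x|^{\alpha/(\alpha-1)})$
$\alpha = 1$, $-1 = \beta_0 < \beta_1 < 1$            $\ln(R_1/R_0) + O(x^{-\alpha})$    $O(e^{(\pi/2)|x|})$
$1 < \alpha < 2$, $-1 = \beta_0$, $1 = \beta_1$       $O(-x^{\alpha/(\alpha-1)})$        $O(|x|^{\alpha/(\alpha-1)})$
$\alpha = 1$, $-1 = \beta_0$, $1 = \beta_1$           $O(-e^{(\pi/2)|x|})$               $O(e^{(\pi/2)|x|})$

Case                                                  $x \to +\infty$                    $x \downarrow 0$
$0 < \alpha < 1$, $-1 = \beta_0 < \beta_1 < 1$        $\ln(R_1/R_0) + O(x^{-\alpha})$    $O(x^{\alpha/(\alpha-1)})$

$$R_j = \sin\left[\frac{\pi}{2}(\theta_j - \alpha)\right], \qquad \tan(\pi\theta_j/2) = \beta_j \tan(\pi\alpha/2),$$

$$L_j = \sin\left[\frac{\pi}{2}(\theta_j - \alpha)\right], \qquad \tan(\pi\theta_j/2) = -\beta_j \tan(\pi\alpha/2),$$

$$j = 0, 1.$$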